mercury/bytecode/Bytecode-doc
1997-02-04 02:24:05 +00:00


Summary of types
----------------
- byte
unsigned char 0-255
- cstring
Sequence of non-zero bytes terminated by zero-byte.
XXX: May change this later to allow embedded
zero-bytes in strings.
- short
2 bytes interpreted as a signed short.
MSB is first byte read.
2's complement.
- int
4 bytes interpreted as a signed int
MSB read first.
2's complement
- float
XXX: not yet supported but presumably..
4 bytes interpreted as float.
MSB read first.
Must be IEEE float format.
- list of T
- contiguous sequence of T
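As an illustration of the primitive types above, here is a minimal Python sketch of readers for byte, short, int, cstring and list of T. The function names are hypothetical and not part of the bytecode format. The sketch reads short as a signed two's-complement value, an assumption based on the -1 label id that endof_disjunct can carry. Floats are left out because they are not yet supported.

```python
import struct
from io import BytesIO

def read_byte(stream):
    """Read one unsigned byte (0-255)."""
    return struct.unpack(">B", stream.read(1))[0]

def read_short(stream):
    """Read 2 bytes, MSB first, as a two's-complement value (assumed signed)."""
    return struct.unpack(">h", stream.read(2))[0]

def read_int(stream):
    """Read 4 bytes, MSB first, as a signed two's-complement int."""
    return struct.unpack(">i", stream.read(4))[0]

def read_cstring(stream):
    """Read non-zero bytes up to the terminating zero byte."""
    out = bytearray()
    while True:
        b = stream.read(1)
        if not b:
            raise EOFError("unterminated cstring")
        if b == b"\0":
            return bytes(out)
        out += b

def read_list(stream, count, read_item):
    """Read `count` contiguous items using the given item reader."""
    return [read_item(stream) for _ in range(count)]
```

The list reader takes its length as a parameter because, in the bytecodes that carry lists, the length is a separate short field that precedes the list.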
- determinism
one byte interpreted as follows
- 0 = det
- 1 = semidet
- 2 = multidet
- 3 = nondet
- 4 = cc_multidet
- 5 = cc_nondet
- 6 = erroneous
- 7 = failure
- tag is one of:
% XXX Need explanation of all these.
- 0 (byte) (simple tag)
- primary (byte)
- 1 (byte) (complicated tag)
- primary (byte)
- secondary (int)
- 2 (byte) (complicated constant tag)
- primary (byte)
- secondary (int)
- 3 (byte) (enum tag)
Enumeration of pure constants.
- 4 (byte) (no_tag)
- cons_id (constructor id) is one of:
% Note that not all of these alternatives are
% meaningful in all bytecodes that have arguments of
% type cons_id. XXX: Specify exactly which cases
% are meaningful.
- 0 (byte) (cons)
- functor name (cstring)
- arity (short)
- tag (tag)
- 1 (byte) (int const)
- integer constant (int)
- 2 (byte) (string const)
- string constant (cstring) XXX: no '\0' in strings!
- 3 (byte) (float const)
- float constant (float) XXX: not yet supported
- 4 (byte) (pred const)
- module id (cstring)
- predicate id (cstring)
- arity (short)
- procedure id (byte)
- 5 (byte) (code addr const)
- module id (cstring)
- predicate id (cstring)
- arity (short)
- procedure id (byte)
- 6 (byte) (base type info const)
- module id (cstring)
- type name (cstring)
- type arity (byte)
- op_arg (argument to an operator) is one of:
- 0 (byte)
- variable slot (short)
- 1 (byte)
- integer constant (int)
- 2 (byte)
- float constant (float) XXX: not yet supported
- dir (direction of information movement in general unification)
is one of:
- 0 (byte) to_arg
- 1 (byte) to_var
- 2 (byte) to_none
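The compound types above (tag, cons_id, dir) are each a leading byte that selects an alternative, followed by that alternative's fields. The following Python sketch decodes them under that reading. All function names and the tuple representation are hypothetical. The sketch assumes that enum tag and no_tag carry no further fields, since none are listed, and that short is signed.

```python
import struct
from io import BytesIO

def _unpack(stream, fmt):
    """Read one big-endian value described by a struct format string."""
    return struct.unpack(fmt, stream.read(struct.calcsize(fmt)))[0]

def read_byte(stream):
    return _unpack(stream, ">B")

def read_short(stream):
    return _unpack(stream, ">h")

def read_int(stream):
    return _unpack(stream, ">i")

def read_cstring(stream):
    """Collect bytes until the zero byte (or end of input) is reached."""
    out = bytearray()
    while (b := stream.read(1)) not in (b"", b"\0"):
        out += b
    return bytes(out)

DIRS = ("to_arg", "to_var", "to_none")  # indexed by the dir byte

def read_tag(stream):
    kind = read_byte(stream)
    if kind == 0:
        return ("simple", read_byte(stream))
    if kind in (1, 2):
        name = "complicated" if kind == 1 else "complicated_constant"
        primary = read_byte(stream)
        return (name, primary, read_int(stream))
    if kind == 3:
        return ("enum",)      # assumption: no fields follow
    if kind == 4:
        return ("no_tag",)    # assumption: no fields follow
    raise ValueError(f"unknown tag kind {kind}")

def read_cons_id(stream):
    kind = read_byte(stream)
    if kind == 0:
        name = read_cstring(stream)
        arity = read_short(stream)
        return ("cons", name, arity, read_tag(stream))
    if kind == 1:
        return ("int_const", read_int(stream))
    if kind == 2:
        return ("string_const", read_cstring(stream))
    if kind == 3:
        raise NotImplementedError("float constants are not yet supported")
    if kind in (4, 5):
        name = "pred_const" if kind == 4 else "code_addr_const"
        module = read_cstring(stream)
        pred = read_cstring(stream)
        arity = read_short(stream)
        return (name, module, pred, arity, read_byte(stream))
    if kind == 6:
        module = read_cstring(stream)
        type_name = read_cstring(stream)
        return ("base_type_info_const", module, type_name, read_byte(stream))
    raise ValueError(f"unknown cons_id kind {kind}")
```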
Summary of Bytecodes
--------------------
% Note: Currently we specify only the static layout of bytecodes.
% We also need to specify the operational semantics of the bytecodes,
% which can be done by specifying state transitions on the abstract
% machine. That is, to specify the meaning of a bytecode, we simply
% say how the state of the abstract machine has changed from before
% interpreting the bytecode to after interpreting the bytecode.
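To make the state-transition idea concrete, here is a minimal Python sketch for one bytecode, assign, whose meaning is stated in this document (A := B). The State class, its fields and the function name are hypothetical. The real abstract machine would also need registers, labels and failure contexts, which are omitted here.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class State:
    slots: tuple  # contents of the variable slots
    pc: int       # index of the next bytecode to interpret (hypothetical)

def step_assign(state, a, b):
    """Transition for assign: copy the contents of slot b into slot a."""
    slots = list(state.slots)
    slots[a] = slots[b]
    return replace(state, slots=tuple(slots), pc=state.pc + 1)
```

Because the state is immutable, the states before and after the bytecode can be compared directly, which is the form of specification the note above asks for.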
- enter_pred (0)
- predicate name (cstring)
- number of procedures in predicate (short)
- endof_pred (1)
- enter_proc (2)
- procedure id (byte) XXX: should use short instead?
procedure id is used to distinguish the procedures
in a predicate.
- determinism of the procedure (determinism)
- label count (short)
Number of labels in the procedure. Used for allocating a
table of labels in the interpreter.
- temp count (short)
Number of temporary variables needed for this procedure. (?)
- length of list (short)
Number of items in next arg
- list of
- Variable info (cstring)
XXX: we should also have typeinfo for each variable.
- endof_proc (3)
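As a worked example of the static layout, the following Python sketch decodes the arguments of enter_proc in the order listed above. It assumes the bytecode number itself has already been consumed, that short is signed, and that each variable info entry is a single cstring. The function names and the dictionary keys are hypothetical.

```python
import struct
from io import BytesIO

DETERMINISM = ("det", "semidet", "multidet", "nondet",
               "cc_multidet", "cc_nondet", "erroneous", "failure")

def read_byte(stream):
    return struct.unpack(">B", stream.read(1))[0]

def read_short(stream):
    return struct.unpack(">h", stream.read(2))[0]

def read_cstring(stream):
    """Collect bytes until the zero byte (or end of input) is reached."""
    out = bytearray()
    while (b := stream.read(1)) not in (b"", b"\0"):
        out += b
    return bytes(out)

def read_enter_proc(stream):
    """Decode the enter_proc arguments that follow the bytecode number."""
    proc_id = read_byte(stream)
    determinism = DETERMINISM[read_byte(stream)]
    label_count = read_short(stream)
    temp_count = read_short(stream)
    var_count = read_short(stream)
    var_info = [read_cstring(stream) for _ in range(var_count)]
    return {
        "proc_id": proc_id,
        "determinism": determinism,
        "label_count": label_count,
        "temp_count": temp_count,
        "var_info": var_info,
    }
```

An interpreter could use label_count at this point to allocate its table of labels before reading the body of the procedure.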
- label (4)
- Code label. (short)
Used for jumps, switches, if-then-else, etc.
- enter_disjunction (5)
- label id (short)
Label refers to the label immediately after the disjunction.
- endof_disjunction (6)
- enter_disjunct (7)
- label id (short)
Label refers to label for next disjunct.
- endof_disjunct (8)
- label id (short)
Label refers to label for next disjunct. (?)
Is -1 if there is no next disjunct in this disjunction.
- enter_switch (9)
- variable in slots on which we are switching (short)
- label immediately after the switch (short)
We jump to the label after we've performed the switch.
label refers to label immediately after corresponding
endof_switch.
- endof_switch (10)
- enter_switch_arm (11)
- constructor id (cons_id)
- label id (short)
label refers to label for next switch arm.
- endof_switch_arm (12)
- label id (short)
Label id refers to label immediately before next switch arm.
(?)
- enter_if (13)
- else label id (short)
- follow label id (short)
label refers to label at endof_if
Note that we must've pushed a failure context
before entering the enter_if. If the condition
fails, we follow the failure context.
- frame pointer tmp (short)
XXX: hmm... dunno..
- enter_then (14)
- frame pointer temp (short)
XXX: what's this for?
XXX: should have flag here? [I wrote this note in a meeting.
What in hell did I mean?]
- endof_then (15) XXX: enter_else is a better name.
- follow label (short)
XXX: label just before endof_if ???
- endof_if (16)
- enter_negation (17)
- label id (short)
label refers to label at endof_negation.
Note: As with if-then-else, we must push a failure
context just before entering enter_negation. If the
negation fails, we follow the failure context.
- endof_negation (18)
- enter_commit (19)
- temp (short)
XXX: what's this for?
% XXX: how does this work?
- endof_commit (20)
- temp (short)
XXX: what's this for?
- assign (21)
- Variable A in slots (short)
- Variable B in slots (short)
A := B. Copy contents of slot B to slot A.
- test (22)
- Variable A in slots (short)
- Variable B in slots (short)
Used to test atomic values (int, float, etc). Before entering
test, a failure context must be pushed. If the test fails,
the failure context is followed.
- construct (23)
- variable slot (short)
- constructor id (cons_id)
- list length of next arg (short)
- list of:
- variable slot (short)
Apply constructor to list of arguments (in list of variable slots)
and store result in a variable slot.
- deconstruct (24)
- variable slot Var (short)
- constructor id (cons_id)
- list length of next arg (short)
- list of:
- variable slot (short)
If cons_id is:
- a functor applied to some args, then remove functor
and put args into variable slots.
- an integer constant, then check for equality of
the constant and the value in the variable slot
- a float constant, then check for equality of
the constant and the value in the variable slot.
- anything else, then makes no sense and interpreter
should raise error. XXX: correct?
Note: We must push a failure context before entering deconstruct.
If the deconstruct fails (i.e. functor of Var isn't
the same as cons_id, or ints are not equal, or floats are
not equal), then we must follow the failure context.
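The cases of deconstruct described above can be sketched in Python as follows. This is only an illustration under stated assumptions: terms are modelled as (functor, args) tuples, a cons_id as a small tuple, and following the failure context is modelled by raising a hypothetical Failure exception. None of these representations come from the interpreter.

```python
class Failure(Exception):
    """Stands in for following the failure context (hypothetical)."""

def deconstruct(slots, var, cons_id, arg_slots):
    """Deconstruct slots[var] against cons_id, filling arg_slots on success."""
    value = slots[var]
    kind = cons_id[0]
    if kind == "cons":
        _, name, arity = cons_id
        if not isinstance(value, tuple):
            raise Failure()
        functor, args = value
        if functor != name or len(args) != arity:
            raise Failure()
        # Remove the functor and put its arguments into the variable slots.
        for slot, arg in zip(arg_slots, args):
            slots[slot] = arg
    elif kind in ("int_const", "float_const"):
        # Check the constant against the value in the variable slot.
        if value != cons_id[1]:
            raise Failure()
    else:
        # Any other cons_id makes no sense here.
        raise ValueError(f"cannot deconstruct with cons_id kind {kind}")
```

For example, with slots = [("f", (1, 2)), None, None], calling deconstruct(slots, 0, ("cons", "f", 2), [1, 2]) stores 1 and 2 in slots 1 and 2, while the same call with functor "g" raises Failure.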
- complex_construct (25)
- var (short)
- cons id (cons_id)
- list length (short)
- list of:
- var (short)
- direction (dir)
This is used for general unification using partially instantiated
terms. This is made possible by bromage's aliasing work.
- complex_deconstruct (26)
- variable slot (short)
- constructor id (cons_id)
- list length of next arg (short)
- list of
- variable slot (short)
- direction (dir)
Note: This is a generalised deconstruct. The directions specify
which way bindings move. XXX: This is still not 100% crystal clear.
- place_arg (27)
- register number (byte)
XXX: Do we have at most 256 registers?
- variable number (short)
Move argument from variable slot to register.
(Note: See notes for pickup_arg.)
XXX: We will need to #include imp.h from the Mercury runtime,
since this specifies the usage of registers. For example, we
need to know whether we're using the compact or non-compact
register allocation method for parameter passing. (The compact
method reuses input registers as output registers. In the
non-compact mode, input and output registers are distinct.)
- pickup_arg (28)
- register number (byte)
- variable number in variable slots (short)
Move argument from register to variable slot.
(Note: We currently don't make use of floating-point registers.
The datatype for pickup_arg in the bytecode generator allows
for distinguishing register `types', that is floating-point
register or normal registers. We may later want to spit out
another byte `r' or `f' to identify the type of register.)
- call (29)
- module id (cstring)
- predicate id (cstring)
- arity (short)
- procedure id (byte)
XXX: If we call a Mercury builtin, the module name is `mercury_builtin'.
What if the user has a module called mercury_builtin?
- higher_order_call (30)
- var (short)
- input variable count (short)
- output variable count (short)
- determinism (determinism)
- builtin_binop (31)
- binary operator (byte)
This single byte is an index into a table of binary
operators.
- argument to binary operator (op_arg)
- another argument to binary operator (op_arg)
- variable slot which receives result of binary operation (short)
XXX: Floating point operations must be distinguished from
int operations. In the interpreter, we should use a lookup table
that maps bytes to the operations.
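The lookup-table approach for builtin_binop might look like the Python sketch below. The operator numbering in BINOPS is invented for illustration, since this document does not define the table of binary operators. An op_arg is modelled as a (kind, payload) pair using the kind bytes from the op_arg type: 0 for a variable slot, 1 for an integer constant.

```python
import operator

# Hypothetical numbering: the real operator table is not specified here.
BINOPS = {0: operator.add, 1: operator.sub, 2: operator.mul}

def eval_op_arg(slots, arg):
    """Evaluate an op_arg given as a (kind, payload) pair."""
    kind, payload = arg
    if kind == 0:
        return slots[payload]   # variable slot
    if kind == 1:
        return payload          # integer constant
    raise NotImplementedError("float constants are not yet supported")

def builtin_binop(slots, op, left, right, result_slot):
    """Apply binary operator `op` and store the result in result_slot."""
    func = BINOPS[op]
    slots[result_slot] = func(eval_op_arg(slots, left),
                              eval_op_arg(slots, right))
```

With slots = [3, None], the call builtin_binop(slots, 0, (0, 0), (1, 4), 1) adds the contents of slot 0 to the constant 4 and stores 7 in slot 1.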
- builtin_unop (32)
- unary operator (byte)
An index into a table of unary operators.
- argument to unary operator (op_arg)
- variable slot which receives result of unary operation (short)
- builtin_bintest (33)
- binary operator (byte)
An index into a table of binary operators.
- argument to binary test (op_arg)
- another argument to binary test (op_arg)
Note we must first push a choice point which we may follow should
the test fail.
- builtin_untest (34)
- unary operator (byte)
An index into a table of unary operators.
- argument to unary operator (op_arg)
Note we must first push a choice point which we may follow should
the test fail.
- semidet_succeed (35)
- semidet_success_check (36)
- fail (37)
- context (38)
- line number in Mercury source that the current bytecode
line corresponds to. (short)
XXX: Still not clear how we should implement `step' in a debugger
since a single context may have other contexts interleaved in it.
- not_supported (39)
Some unsupported feature is used. Inline C in Mercury code,
for instance. Any procedure that contains inline C
(or is compiled Mercury?) must have the format:
enter_pred ...
not_supported
endof_pred
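For reference, the bytecode numbers listed in this summary can be collected into a single table. The Python sketch below does so, with the position in the list giving the bytecode number. The names OPCODES and opcode_name are hypothetical, and the sketch assumes the numbers 0 to 39 above are the complete set.

```python
OPCODES = [
    "enter_pred", "endof_pred", "enter_proc", "endof_proc", "label",
    "enter_disjunction", "endof_disjunction", "enter_disjunct",
    "endof_disjunct", "enter_switch", "endof_switch", "enter_switch_arm",
    "endof_switch_arm", "enter_if", "enter_then", "endof_then", "endof_if",
    "enter_negation", "endof_negation", "enter_commit", "endof_commit",
    "assign", "test", "construct", "deconstruct", "complex_construct",
    "complex_deconstruct", "place_arg", "pickup_arg", "call",
    "higher_order_call", "builtin_binop", "builtin_unop", "builtin_bintest",
    "builtin_untest", "semidet_succeed", "semidet_success_check", "fail",
    "context", "not_supported",
]

def opcode_name(number):
    """Map a bytecode number to its name, rejecting unknown numbers."""
    if not 0 <= number < len(OPCODES):
        raise ValueError(f"unknown bytecode {number}")
    return OPCODES[number]
```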