mercury/runtime/mercury_deconstruct.c
Zoltan Somogyi 624aaa01f1 Pack subword-sized arguments next to a local sectag.
compiler/du_type_layout.m:
    If a new option is set, then try to represent function symbols with
    only subword-sized arguments by packing those arguments into the same word
    as the primary tag and (if it is needed) a secondary tag.
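
    As a rough illustration of this layout, the sketch below packs a
    primary tag, a local secondary tag and two subword-sized arguments
    into one word and reads them back. The field widths, the field order
    above the sectag, and every name in it (word_t, pack_term, get_field)
    are assumptions made for the example; they are not taken from
    du_type_layout.m or from the runtime.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout parameters; none of these names come from the Mercury sources. */
enum {
    PTAG_BITS   = 2,    /* primary tag width */
    SECTAG_BITS = 3,    /* local secondary tag width */
    ARG0_BITS   = 8,    /* first argument: a uint8 */
    ARG1_BITS   = 2     /* second argument: a four-valued enum */
};

/* Each field sits directly above the previous one, starting at bit 0. */
enum {
    SECTAG_SHIFT = PTAG_BITS,
    ARG0_SHIFT   = SECTAG_SHIFT + SECTAG_BITS,
    ARG1_SHIFT   = ARG0_SHIFT + ARG0_BITS
};

typedef uintptr_t word_t;

/* A mask with the low `bits` bits set; the shift is done at word width. */
static word_t field_mask(unsigned bits)
{
    return ((word_t) 1 << bits) - 1;
}

/* Build one word holding the ptag, the sectag and both arguments. */
static word_t pack_term(unsigned ptag, unsigned sectag,
    uint8_t arg0, unsigned arg1)
{
    assert(ptag <= field_mask(PTAG_BITS));
    assert(sectag <= field_mask(SECTAG_BITS));
    assert(arg1 <= field_mask(ARG1_BITS));

    return (word_t) ptag
        | ((word_t) sectag << SECTAG_SHIFT)
        | ((word_t) arg0 << ARG0_SHIFT)
        | ((word_t) arg1 << ARG1_SHIFT);
}

/* Extract the field of width `bits` that starts at bit `shift`. */
static unsigned get_field(word_t w, unsigned shift, unsigned bits)
{
    return (unsigned) ((w >> shift) & field_mask(bits));
}
```

    With these assumed widths the whole term occupies 15 bits, so
    get_field(w, SECTAG_SHIFT, SECTAG_BITS) has to mask off the argument
    bits above the sectag before the sectag can be used as an index.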

    If there are too many such function symbols for the available number of
    bits, pick the ones that need the least number of bits, in order to
    allow us to use this representation for as many such function symbols
    as possible.
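
    One way to read this selection rule is as a greedy choice over the
    candidates sorted by the number of argument bits they need. The sketch
    below is only that reading, not the code in du_type_layout.m: it
    assumes the secondary tag has to distinguish only the chosen function
    symbols (the real shared tag may also cover constants), and the names
    choose_packable and bits_needed are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Smallest b such that 2^b >= n; zero or one alternative needs no bits. */
static unsigned bits_needed(unsigned long n)
{
    unsigned bits = 0;

    if (n <= 1) {
        return 0;
    }
    n -= 1;
    while (n > 0) {
        bits++;
        n >>= 1;
    }
    return bits;
}

/* Ascending comparison of two unsigned values, for qsort. */
static int cmp_unsigned(const void *a, const void *b)
{
    unsigned x = *(const unsigned *) a;
    unsigned y = *(const unsigned *) b;

    return (x > y) - (x < y);
}

/*
** arg_bits[i] is the number of bits candidate i needs for its arguments.
** Sorts arg_bits in place (ascending) and returns the largest k such that
** the k cheapest candidates all fit in avail_bits once a sectag wide
** enough to tell k alternatives apart has been reserved.
*/
static size_t choose_packable(unsigned *arg_bits, size_t n,
    unsigned avail_bits)
{
    size_t k;
    size_t best = 0;

    if (n == 0) {
        return 0;
    }
    assert(arg_bits != NULL);
    qsort(arg_bits, n, sizeof arg_bits[0], cmp_unsigned);

    /* Both terms of `need` are nondecreasing in k, so stop at the first miss. */
    for (k = 1; k <= n; k++) {
        unsigned need = bits_needed(k) + arg_bits[k - 1];
        if (need > avail_bits) {
            break;
        }
        best = k;
    }
    return best;
}
```

    For candidates needing 8, 3, 9, 5 and 12 bits and 10 available bits,
    this picks the three cheapest (3, 5 and 8 bits): they need a 2-bit
    sectag, and 2 + 8 = 10 still fits, while adding the 9-bit candidate
    would not.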

    This diff implements this packing only for types that have more than one
    argument, because implementing it for types that have only one argument
    has two extra complications. One is the need for another new cons_id
    (see below), which would make this diff bigger and harder to review.
    The other is the need to consider interactions with the direct_arg
    optimization.

    Don't invoke the code for deciding the representation of arguments
    if either (a) the function symbol has no arguments, or (b) its cons_id
    alone dictates how we will treat its argument (in such cases, there is
    always exactly one).

    Fix a bug in computing the number of bits needed to distinguish N things.
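
    The log does not say what the bug was. For reference, the quantity in
    question is the smallest b with 2^b >= N, which is easy to get wrong by
    one when N is an exact power of two. The C sketch below shows one way
    to compute it; bits_needed is a hypothetical name, and this is not the
    Mercury code from du_type_layout.m.

```c
#include <assert.h>     /* for the checks mentioned in the note below */

/*
** Number of bits needed to give each of n things a distinct code,
** i.e. the smallest b such that 2^b >= n.
** Zero or one thing needs no bits at all.
*/
static unsigned bits_needed(unsigned long n)
{
    unsigned bits = 0;

    if (n <= 1) {
        return 0;
    }

    /* The codes run from 0 to n - 1, so count the bits of n - 1. */
    n -= 1;
    while (n > 0) {
        bits++;
        n >>= 1;
    }
    return bits;
}
```

    Counting the bits of N - 1 rather than of N is what keeps the result
    for 4 and 8 things at 2 and 3 bits respectively, so that
    assert(bits_needed(8) == 3) and assert(bits_needed(9) == 4) both hold.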

    Store the value of the "experiment" option in the params for now,
    since it has helped track down bugs in this change, and may do the same
    for my next change. It costs next to nothing.

compiler/options.m:
    Add an option that controls whether we allow du_type_layout to pack
    arguments next to local secondary tags. The default value is "no",
    since "yes" may break binary compatibility.

    Add an option that controls whether we allow du_type_layout to pack
    arguments next to remote secondary tags. This option is not yet used.

compiler/hlds_data.m:
    Add a new cons_id, shared_local_tag_with_args, to represent function
    symbols in which the arguments are packed next to a local secondary tag.
    Rename the existing shared_local_tag cons_id as shared_local_tag_no_args,
    to clarify the distinction.

    Redesign the representation of secondary tags a bit, to meet the
    requirements I discovered while implementing the new data representation.

compiler/prog_data.m:
    Document the now-expanded uses of the arg_pos_width type.

compiler/ml_unify_gen.m:
compiler/unify_gen.m:
    Implement unifications involving the new cons_id.

compiler/var_locn.m:
    Implementing deconstruction unifications involving both right-to-left data
    flow and the new cons_id for the LLDS backend requires var_locn.m
    to implement a new kind of assignment to a variable: one that updates
    its old value. Add a predicate for this. (Previously, deconstructions
    with right-to-left flow could update the old value of a word in a
    memory cell, whose state var_locn.m does *not* track.)

compiler/code_loc_dep.m:
    Provide the interface between unify_gen.m and var_locn.m.

compiler/code_info.m:
    Store the number of primary tag bits in the code_info, to avoid looking it
    up in the globals structure each time, since the new code in unify_gen.m
    needs it more often.

compiler/hlds_out_module.m:
doc/user_guide.texi:
    Implement the capability of restricting the dump of the type table
    to only the types defined in the module being compiled. Without this,
    the type table is cluttered with information about types in other
    modules, including the automatically-included builtin modules.

compiler/handle_options.m:
    Add a new value of the -D option. The new value, du, asks for the
    dumping out of the representations of only the locally defined types.

compiler/ml_gen_info.m:
    Store the number of primary tag bits as a uint8, not as int.

compiler/ml_tag_switch.m:
compiler/switch_util.m:
compiler/tag_switch.m:
    Update the code that generates switches on du types to handle
    local secondary tags that must be masked off before use.

compiler/rtti.m:
    Update the compiler's representation of RTTI information to account for
    the new data representation.

compiler/type_ctor_info.m:
    Construct the updated RTTI representation.

compiler/bytecode_gen.m:
compiler/export.m:
compiler/ml_switch_gen.m:
compiler/ml_type_gen.m:
compiler/modecheck_goal.m:
compiler/rtti_out.m:
compiler/rtti_to_mlds.m:
    Conform to the changes above.

runtime/mercury_type_info.h:
    Extend the representation of du functors in the RTTI to account for
    the new data representation scheme. The extensions add only to the
    *ends* of structures, or to lists of enum values, with the extensions
    only being used if the representation is actually used, which should
    allow the updated runtime to also work with .c files that were compiled
    with a compiler that does *not* have this diff. For the same reason,
    make the old enum value MR_SECTAG_LOCAL a synonym for the new
    MR_SECTAG_LOCAL_REST_OF_WORD, which expresses a distinction that
    did not previously exist.

    Delete a reference to a file that no longer exists.

runtime/mercury_dotnet.cs.in:
library/rtti_implementation.m:
    Update the C# and Mercury mirrors of the types updated in
    mercury_type_info.h.

runtime/mercury_deconstruct.c:
runtime/mercury_deconstruct_macros.h:
runtime/mercury_ml_expand_body.h:
    Implement the deconstruction of terms using the new data representation.

runtime/mercury_deep_copy_body.h:
    Implement the copying of terms using the new data representation.

runtime/mercury_table_type_body.h:
    Implement the tabling of terms using the new data representation.

runtime/mercury_term_size.c:
    Implement computing the size of terms using the new data representation.

runtime/mercury_unify_compare_body.h:
    Implement RTTI-based unifications of terms using the new data
    representation. (Or at least make a first attempt at this implementation.
    We never use RTTI-based unification, so this code has not been tested,
    but it is not clear that it *needs* to be tested.)

library/construct.m:
    Implement the construction of terms using the new data representation.

library/private_builtin.m:
    List MR_SECTAG_LOCAL_REST_OF_WORD as a synonym of MR_SECTAG_LOCAL for Java,
    since rtti_to_mlds.m will now emit the new version.

    Note that the new data representation is not applicable to Java (or C#),
    so it should never see the other kind of sectag (MR_SECTAG_LOCAL_BITS).

tests/hard_coded/sectag_bits.{m,exp}:
tests/hard_coded/sectag_bits_test_data:
    A new test case to test the reading in and writing out (and therefore
    the construction and deconstruction) of terms containing arguments
    packed with a local sectag.

tests/hard_coded/Mmakefile:
    Enable the new test case.
2018-07-08 17:54:11 +02:00

// vim: ts=4 sw=4 expandtab ft=c
// Copyright (C) 2002-2007, 2011 The University of Melbourne.
// Copyright (C) 2013-2018 The Mercury team.
// This file is distributed under the terms specified in COPYING.LIB.
// mercury_deconstruct.c
//
// This file provides utility functions for deconstructing terms, for use by
// the standard library.
#include "mercury_imp.h"
#include "mercury_deconstruct.h"
#include "mercury_deconstruct_macros.h"
#include "mercury_type_desc.h"
#include "mercury_minimal_model.h"
// We reserve a buffer to hold the names we dynamically generate
// for "functors" of foreign types. This macro gives its size.
#define MR_FOREIGN_NAME_BUF_SIZE 256
static MR_ConstString MR_expand_type_name(MR_TypeCtorInfo tci, MR_bool);
#define EXPAND_FUNCTION_NAME MR_expand_functor_args
#define EXPAND_TYPE_NAME MR_ExpandFunctorArgsInfo
#define EXPAND_FUNCTOR_FIELD functor
#define EXPAND_ARGS_FIELD args
#include "mercury_ml_expand_body.h"
#undef EXPAND_FUNCTION_NAME
#undef EXPAND_TYPE_NAME
#undef EXPAND_FUNCTOR_FIELD
#undef EXPAND_ARGS_FIELD
#define EXPAND_FUNCTION_NAME MR_expand_functor_args_limit
#define EXPAND_TYPE_NAME MR_ExpandFunctorArgsLimitInfo
#define EXPAND_FUNCTOR_FIELD functor
#define EXPAND_ARGS_FIELD args
#define EXPAND_APPLY_LIMIT
#include "mercury_ml_expand_body.h"
#undef EXPAND_FUNCTION_NAME
#undef EXPAND_TYPE_NAME
#undef EXPAND_FUNCTOR_FIELD
#undef EXPAND_ARGS_FIELD
#undef EXPAND_APPLY_LIMIT
#define EXPAND_FUNCTION_NAME MR_expand_functor_only
#define EXPAND_TYPE_NAME MR_ExpandFunctorOnlyInfo
#define EXPAND_FUNCTOR_FIELD functor_only
#include "mercury_ml_expand_body.h"
#undef EXPAND_FUNCTION_NAME
#undef EXPAND_TYPE_NAME
#undef EXPAND_FUNCTOR_FIELD
#define EXPAND_FUNCTION_NAME MR_expand_args_only
#define EXPAND_TYPE_NAME MR_ExpandArgsOnlyInfo
#define EXPAND_ARGS_FIELD args_only
#include "mercury_ml_expand_body.h"
#undef EXPAND_FUNCTION_NAME
#undef EXPAND_TYPE_NAME
#undef EXPAND_ARGS_FIELD
#define EXPAND_FUNCTION_NAME MR_expand_chosen_arg_only
#define EXPAND_TYPE_NAME MR_ExpandChosenArgOnlyInfo
#define EXPAND_CHOSEN_ARG
#include "mercury_ml_expand_body.h"
#undef EXPAND_FUNCTION_NAME
#undef EXPAND_TYPE_NAME
#undef EXPAND_CHOSEN_ARG
#define EXPAND_FUNCTION_NAME MR_expand_named_arg_only
#define EXPAND_TYPE_NAME MR_ExpandChosenArgOnlyInfo
#define EXPAND_NAMED_ARG
#include "mercury_ml_expand_body.h"
#undef EXPAND_FUNCTION_NAME
#undef EXPAND_TYPE_NAME
#undef EXPAND_NAMED_ARG
// N.B. any modifications to the signature of this function will require
// changes not only to library/deconstruct.m, but also to library/store.m
// and extras/trailed_update/tr_store.m.
MR_bool
MR_arg(MR_TypeInfo type_info, MR_Word *term_ptr, int arg_index,
    MR_TypeInfo *arg_type_info_ptr, MR_Word **arg_ptr,
    const MR_DuArgLocn **arg_locn_ptr, MR_noncanon_handling noncanon)
{
    MR_ExpandChosenArgOnlyInfo expand_info;

    MR_expand_chosen_arg_only(type_info, term_ptr, noncanon, arg_index,
        &expand_info);

    // Check range.
    if (expand_info.chosen_index_exists) {
        *arg_type_info_ptr = expand_info.chosen_type_info;
        *arg_ptr = expand_info.chosen_value_ptr;
        *arg_locn_ptr = expand_info.chosen_arg_locn;
        return MR_TRUE;
    }

    return MR_FALSE;
}

MR_bool
MR_named_arg(MR_TypeInfo type_info, MR_Word *term_ptr, MR_ConstString arg_name,
    MR_TypeInfo *arg_type_info_ptr, MR_Word **arg_ptr,
    const MR_DuArgLocn **arg_locn_ptr, MR_noncanon_handling noncanon)
{
    MR_ExpandChosenArgOnlyInfo expand_info;

    MR_expand_named_arg_only(type_info, term_ptr, noncanon, arg_name,
        &expand_info);

    // Check that an argument with the given name exists.
    if (expand_info.chosen_index_exists) {
        *arg_type_info_ptr = expand_info.chosen_type_info;
        *arg_ptr = expand_info.chosen_value_ptr;
        *arg_locn_ptr = expand_info.chosen_arg_locn;
        return MR_TRUE;
    }

    return MR_FALSE;
}

MR_bool
MR_named_arg_num(MR_TypeInfo type_info, MR_Word *term_ptr,
const char *arg_name, int *arg_num_ptr)
{
MR_TypeCtorInfo type_ctor_info;
MR_DuTypeLayout du_type_layout;
const MR_DuPtagLayout *ptag_layout;
const MR_DuFunctorDesc *functor_desc;
const MR_NotagFunctorDesc *notag_functor_desc;
MR_Word data;
int ptag;
MR_Word sectag;
MR_TypeInfo eqv_type_info;
int i;
type_ctor_info = MR_TYPEINFO_GET_TYPE_CTOR_INFO(type_info);
if (! MR_type_ctor_has_valid_rep(type_ctor_info)) {
MR_fatal_error("MR_named_arg_num: term of unknown representation");
}
switch (MR_type_ctor_rep(type_ctor_info)) {
case MR_TYPECTOR_REP_DU_USEREQ:
case MR_TYPECTOR_REP_DU:
data = *term_ptr;
du_type_layout = MR_type_ctor_layout(type_ctor_info).MR_layout_du;
ptag = MR_tag(data);
ptag_layout = &du_type_layout[ptag];
switch (ptag_layout->MR_sectag_locn) {
case MR_SECTAG_NONE:
case MR_SECTAG_NONE_DIRECT_ARG:
functor_desc = ptag_layout->MR_sectag_alternatives[0];
break;
case MR_SECTAG_LOCAL_REST_OF_WORD:
sectag = MR_unmkbody(data);
functor_desc = ptag_layout->MR_sectag_alternatives[sectag];
break;
case MR_SECTAG_LOCAL_BITS:
sectag = MR_unmkbody(data) &
(((MR_Word) 1 << ptag_layout->MR_sectag_numbits) - 1);
functor_desc = ptag_layout->MR_sectag_alternatives[sectag];
break;
case MR_SECTAG_REMOTE:
sectag = MR_field(ptag, data, 0);
functor_desc = ptag_layout->MR_sectag_alternatives[sectag];
break;
case MR_SECTAG_VARIABLE:
MR_fatal_error("MR_named_arg_num(): unexpected variable");
default:
MR_fatal_error("MR_named_arg_num(): invalid sectag_locn");
}
if (functor_desc->MR_du_functor_arg_names == NULL) {
return MR_FALSE;
}
for (i = 0; i < functor_desc->MR_du_functor_orig_arity; i++) {
if (functor_desc->MR_du_functor_arg_names[i] != NULL
&& MR_streq(arg_name,
functor_desc->MR_du_functor_arg_names[i]))
{
*arg_num_ptr = i;
return MR_TRUE;
}
}
return MR_FALSE;
case MR_TYPECTOR_REP_EQUIV:
eqv_type_info = MR_create_type_info(
MR_TYPEINFO_GET_FIXED_ARITY_ARG_VECTOR(type_info),
MR_type_ctor_layout(type_ctor_info).MR_layout_equiv);
return MR_named_arg_num(eqv_type_info, term_ptr, arg_name,
arg_num_ptr);
case MR_TYPECTOR_REP_EQUIV_GROUND:
eqv_type_info = MR_pseudo_type_info_is_ground(
MR_type_ctor_layout(type_ctor_info).MR_layout_equiv);
return MR_named_arg_num(eqv_type_info, term_ptr, arg_name,
arg_num_ptr);
case MR_TYPECTOR_REP_NOTAG:
case MR_TYPECTOR_REP_NOTAG_USEREQ:
case MR_TYPECTOR_REP_NOTAG_GROUND:
case MR_TYPECTOR_REP_NOTAG_GROUND_USEREQ:
notag_functor_desc = MR_type_ctor_functors(type_ctor_info).
MR_functors_notag;
if (notag_functor_desc->MR_notag_functor_arg_name != NULL
&& MR_streq(arg_name,
notag_functor_desc->MR_notag_functor_arg_name))
{
*arg_num_ptr = 0;
return MR_TRUE;
}
return MR_FALSE;
case MR_TYPECTOR_REP_ENUM:
case MR_TYPECTOR_REP_ENUM_USEREQ:
case MR_TYPECTOR_REP_DUMMY:
case MR_TYPECTOR_REP_INT:
case MR_TYPECTOR_REP_UINT:
case MR_TYPECTOR_REP_INT8:
case MR_TYPECTOR_REP_UINT8:
case MR_TYPECTOR_REP_INT16:
case MR_TYPECTOR_REP_UINT16:
case MR_TYPECTOR_REP_INT32:
case MR_TYPECTOR_REP_UINT32:
case MR_TYPECTOR_REP_INT64:
case MR_TYPECTOR_REP_UINT64:
case MR_TYPECTOR_REP_FLOAT:
case MR_TYPECTOR_REP_CHAR:
case MR_TYPECTOR_REP_STRING:
case MR_TYPECTOR_REP_BITMAP:
case MR_TYPECTOR_REP_FUNC:
case MR_TYPECTOR_REP_PRED:
case MR_TYPECTOR_REP_SUBGOAL:
case MR_TYPECTOR_REP_VOID:
case MR_TYPECTOR_REP_C_POINTER:
case MR_TYPECTOR_REP_STABLE_C_POINTER:
case MR_TYPECTOR_REP_TYPEINFO:
case MR_TYPECTOR_REP_TYPECTORINFO:
case MR_TYPECTOR_REP_TYPEDESC:
case MR_TYPECTOR_REP_TYPECTORDESC:
case MR_TYPECTOR_REP_PSEUDOTYPEDESC:
case MR_TYPECTOR_REP_TYPECLASSINFO:
case MR_TYPECTOR_REP_BASETYPECLASSINFO:
case MR_TYPECTOR_REP_SUCCIP:
case MR_TYPECTOR_REP_HP:
case MR_TYPECTOR_REP_CURFR:
case MR_TYPECTOR_REP_MAXFR:
case MR_TYPECTOR_REP_REDOFR:
case MR_TYPECTOR_REP_REDOIP:
case MR_TYPECTOR_REP_TICKET:
case MR_TYPECTOR_REP_TRAIL_PTR:
case MR_TYPECTOR_REP_REFERENCE:
case MR_TYPECTOR_REP_TUPLE:
case MR_TYPECTOR_REP_ARRAY:
case MR_TYPECTOR_REP_FOREIGN:
case MR_TYPECTOR_REP_STABLE_FOREIGN:
case MR_TYPECTOR_REP_FOREIGN_ENUM:
case MR_TYPECTOR_REP_FOREIGN_ENUM_USEREQ:
case MR_TYPECTOR_REP_UNUSED1:
case MR_TYPECTOR_REP_UNUSED2:
case MR_TYPECTOR_REP_UNKNOWN:
return MR_FALSE;
}
MR_fatal_error("MR_named_arg_num: unexpected fallthrough");
}

static MR_ConstString
MR_expand_type_name(MR_TypeCtorInfo tci, MR_bool wrap)
{
    MR_String   str;
    int         len;

    len = 0;
    len += strlen(tci->MR_type_ctor_module_name);
    len += 1;   // '.'
    len += strlen(tci->MR_type_ctor_name);
    len += 1;   // '/'
    len += 4;   // arity; at most four digits, see the check against 9999 below
    if (wrap) {
        len += 4;   // <<>>
    }
    len += 1;   // terminating NUL

    if (tci->MR_type_ctor_arity > 9999) {
        MR_fatal_error("MR_expand_type_name: arity > 9999");
    }

    MR_restore_transient_hp();
    MR_allocate_aligned_string_msg(str, len, MR_ALLOC_SITE_STRING);
    MR_save_transient_hp();

    sprintf(str, wrap? "<<%s.%s/%d>>" : "%s.%s/%d",
        tci->MR_type_ctor_module_name,
        tci->MR_type_ctor_name,
        (int) tci->MR_type_ctor_arity);

    return (MR_ConstString) str;
}

MR_Word
MR_arg_value_uncommon(MR_Word *arg_ptr, const MR_DuArgLocn *arg_locn)
{
MR_Word val;
// The meanings of the various special values of MR_arg_bits
// are documented next to the definition of the MR_DuArgLocn type
// in mercury_type_info.h.
switch (arg_locn->MR_arg_bits) {
case -1:
// MR_arg_bits == -1 means the argument is a double-precision
// floating point value occupying two words.
#ifdef MR_BOXED_FLOAT
{
MR_Float flt;
flt = MR_float_from_dword(arg_ptr[0], arg_ptr[1]);
#ifdef MR_HIGHLEVEL_CODE
return (MR_Word) MR_box_float(flt);
#else
return MR_float_to_word(flt);
#endif
}
#else
MR_fatal_error("double-word floats should not exist in this grade");
#endif
case -2:
// MR_arg_bits == -2 means the argument is an int64 value
// occupying two words.
#if defined(MR_BOXED_INT64S)
{
int64_t i64;
i64 = MR_int64_from_dword(arg_ptr[0], arg_ptr[1]);
#ifdef MR_HIGHLEVEL_CODE
return (MR_Word) MR_box_int64(i64);
#else
return MR_int64_to_word(i64);
#endif
}
#else
MR_fatal_error("double-word int64s should not exist in this grade");
#endif
case -3:
// MR_arg_bits == -3 means the argument is a uint64 value
// occupying two words.
#if defined(MR_BOXED_INT64S)
{
uint64_t ui64;
ui64 = MR_uint64_from_dword(arg_ptr[0], arg_ptr[1]);
#ifdef MR_HIGHLEVEL_CODE
return (MR_Word) MR_box_uint64(ui64);
#else
return MR_uint64_to_word(ui64);
#endif
}
#else
MR_fatal_error("double-word uint64s should not exist in this grade");
#endif
case -4:
// MR_arg_bits == -4 means the argument is an int8 value
// occupying part of one word.
val = *arg_ptr;
val = (val >> arg_locn->MR_arg_shift) & ((MR_Word) 0xff);
val = (MR_Word) (int8_t) val;
return val;
case -5:
// MR_arg_bits == -5 means the argument is a uint8 value
// occupying part of one word.
val = *arg_ptr;
val = (val >> arg_locn->MR_arg_shift) & ((MR_Word) 0xff);
val = (MR_Word) (uint8_t) val;
return val;
case -6:
// MR_arg_bits == -6 means the argument is an int16 value
// occupying part of one word.
val = *arg_ptr;
val = (val >> arg_locn->MR_arg_shift) & ((MR_Word) 0xffff);
val = (MR_Word) (int16_t) val;
return val;
case -7:
// MR_arg_bits == -7 means the argument is a uint16 value
// occupying part of one word.
val = *arg_ptr;
val = (val >> arg_locn->MR_arg_shift) & ((MR_Word) 0xffff);
val = (MR_Word) (uint16_t) val;
return val;
case -8:
// MR_arg_bits == -8 means the argument is an int32 value
// occupying part of one word.
val = *arg_ptr;
val = (val >> arg_locn->MR_arg_shift) & ((MR_Word) 0xffffffff);
val = (MR_Word) (int32_t) val;
return val;
case -9:
// MR_arg_bits == -9 means the argument is a uint32 value
// occupying part of one word.
val = *arg_ptr;
val = (val >> arg_locn->MR_arg_shift) & ((MR_Word) 0xffffffff);
val = (MR_Word) (uint32_t) val;
return val;
case -10:
// MR_arg_bits == -10 means the argument is of a dummy type.
return 0;
default:
if (arg_locn->MR_arg_bits > 0) {
// The argument is a packed enumeration value.
val = *arg_ptr;
val = (val >> arg_locn->MR_arg_shift)
& (((MR_Word) 1 << arg_locn->MR_arg_bits) - 1);
return val;
} else {
// If MR_arg_bits is exactly zero, this function
// should not have been called at all (since that is
// the *common* case).
MR_fatal_error("unexpected value of MR_arg_bits");
}
}
}