309 Commits

Author SHA1 Message Date
Zoltan Somogyi
67a4ad3cac Make mlds_label a notag type.
compiler/mlds.m:
    Switch mlds_label from an equivalent type to a notag type.

compiler/ml_code_util.m:
compiler/ml_gen_info.m:
compiler/ml_proc_gen.m:
compiler/mlds_dump.m:
compiler/mlds_to_c_stmt.m:
    Conform to the change above.
2023-05-16 13:25:19 +10:00
Zoltan Somogyi
fb4b23091a Don't pass Indent when it is always zero.
compiler/mlds_to_cs_file.m:
compiler/mlds_to_java_file.m:
    Delete the Indent argument from the top level predicates
    which always get passed Indent=0. We are not likely to ever pass
    any other indent value.

    Inline a function that has only one call site.

    In mlds_to_java_file.m, write out the comment with the module name
    regardless of whether auto-comments are enabled.

    In mlds_to_java_file.m, put a blank line after the comment block
    listing all the imported modules.

compiler/ml_type_gen.m:
    Use // comments in sample C code.

compiler/mlds.m:
    Avoid some repeated deconstructs.

compiler/mlds_to_cs_util.m:
compiler/mlds_to_java_util.m:
    Always put a space after an inline comment.

    Make the name of the predicate writing out inline comments
    more descriptive.

compiler/mlds_to_cs_data.m:
compiler/mlds_to_java_data.m:
    Conform to the change above.
2023-05-08 22:02:10 +10:00
Zoltan Somogyi
39eb9e93b0 Fix/update comments in mlds_to_{cs,java}_stmt.m.
compiler/mlds_to_cs_data.m:
compiler/mlds_to_java_data.m:
    Fix comments about the handling of rvals representing enums,
    addressing Julien's review comments. In mlds_to_cs_data.m,
    simplify the code the comments are about.

    Separate out and simplify the code handling bitwise complements.

compiler/ml_type_gen.m:
    Fix some comments.

    Some comments were out-of-date. Update the ones I could update;
    mark the others as out-of-date to warn readers.

compiler/mlds.m:
    Expand some comments, and add an XXX about a possible improvement.

compiler/mlds_to_java_class.m:
    Make clear that updates to a global data structure are used only locally.

compiler/mlds_to_java_stmt.m:
    Use more standard variable names.
2023-05-08 11:11:32 +10:00
Zoltan Somogyi
3143060529 Get mlds_to_{cs,java}_type.m closer to each other ...
... and simplify them.

compiler/mlds_to_cs_type.m:
compiler/mlds_to_java_type.m:
    These two modules contain very similar code, because C# is a near-copy
    of Java, and therefore the MLDS->C# translator was created as a near-copy
    of the MLDS->Java translator. However, they have diverged over time.
    This diff makes them resemble each other more closely again. It does this
    partly by putting the contents of both files into the same order, and
    partly by fixing unnecessary differences between them (e.g. in what code
    is in auxiliary functions and what code is inline).

    It also simplifies things in several respects.

    First, it deletes the output_type_for_{csharp,java}_dims predicates
    after inlining them at their respective only call sites.

    Second, while it keeps the old predicate in each module that converted
    an MLDS type to a C#/Java type name and a list of array dimensions
    (though in a renamed form), it also adds a function wrapper that returns
    the full type name, with the array dimensions applied, because this is
    what most of its callers actually want.

    Third, don't pass a type constructor category to the predicates now named
    mercury_user_type_to_string_and_dims_for_{csharp,java} when what they need
    is an mlds_class_kind; pass the mlds_class_kind directly.

    Fourth, type_is_array_for_{csharp,java} both used to be defined
    using if-then-else chains that tested for a few kinds of mlds_types,
    and treated the rest the same. This code was vulnerable to not being
    updated when new kinds of mlds_types were added. One of these predicates
    had one bug, while the other had two. The bug they shared was that both
    returned "not_array" for all mlds_tabling_types, even though some tabling
    types *are* arrays. This was inconsequential, since neither target
    supports tabling now. The other bug was that type_is_array_for_java
    also returned "not_array" for mlds_mostly_generic_array_type, which
    is always an array.

    Besides these actual bugs, these functions also demanded double
    maintenance, since their semantics required them to return "is_array"
    if and only if type_to_string_and_dims_for_{csharp,java} returned
    a nonempty list of array dimensions. Eliminate this maintenance burden
    by deleting both functions, and making all their callers call a new version
    of output_type_for_{csharp,java} instead. In all but one situation
    (repeated for C# and Java), the callers of type_is_array_for_{csharp,java}
    had already called output_type_for_{csharp,java} just before, so this
    new version, besides printing the target-language type name, also returns
    the list of array dimensions that is one of its intermediate products.

compiler/mlds_to_target_util.m:
    Update init_arg_wrappers_cs_java to operate on a list of array dimensions,
    to conform to the change just above.

    Add a function to convert a list of array dimensions to a string,
    to allow mlds_to_{cs,java}_type.m to construct type names as strings
    without immediately writing out that string. Document the related
    functions.

    Add a predicate to fix the size of an array in one dimension. Its body
    is code that used to be repeated in mlds_to_{cs,java}_type.m.

    Simplify some other code.
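
The dimension-list-to-string function and the full-type-name wrapper described
above can be illustrated with a minimal C sketch. This is not the compiler's
code (which is written in Mercury): the names array_dims_to_string,
type_to_string and type_is_array are hypothetical, and so is the convention
that a dimension of 0 stands for an array of unspecified size. The sketch also
shows how "is this type an array" can be derived from the dimension list
instead of being maintained as a separate test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write the textual form of a list of array dimensions into buf.
   Assumption: a dimension of 0 means "size not specified" and prints
   as "[]"; any other dimension N prints as "[N]".
   Returns 0 on success, -1 if buf is too small. */
static int
array_dims_to_string(const int *dims, size_t num_dims,
    char *buf, size_t buf_size)
{
    size_t used = 0;

    assert(buf != NULL && buf_size > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < num_dims; i++) {
        int n;
        if (dims[i] == 0) {
            n = snprintf(buf + used, buf_size - used, "[]");
        } else {
            n = snprintf(buf + used, buf_size - used, "[%d]", dims[i]);
        }
        if (n < 0 || (size_t) n >= buf_size - used) {
            return -1;
        }
        used += (size_t) n;
    }
    return 0;
}

/* Wrapper that returns the full type name, i.e. the base name with
   the array dimensions applied. */
static int
type_to_string(const char *base, const int *dims, size_t num_dims,
    char *buf, size_t buf_size)
{
    size_t base_len = strlen(base);

    if (base_len >= buf_size) {
        return -1;
    }
    memcpy(buf, base, base_len + 1);
    return array_dims_to_string(dims, num_dims,
        buf + base_len, buf_size - base_len);
}

/* A type is an array if and only if its dimension list is nonempty,
   so no separate per-type test needs to be kept in sync. */
static int
type_is_array(size_t num_dims)
{
    return num_dims > 0;
}
```

With dims = {0, 5} and base name "int", type_to_string produces "int[][5]";
with an empty dimension list it produces just the base name.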

compiler/mlds_to_cs_class.m:
compiler/mlds_to_cs_data.m:
compiler/mlds_to_cs_export.m:
compiler/mlds_to_cs_func.m:
compiler/mlds_to_cs_global.m:
compiler/mlds_to_cs_stmt.m:
compiler/mlds_to_java_class.m:
compiler/mlds_to_java_data.m:
compiler/mlds_to_java_export.m:
compiler/mlds_to_java_func.m:
compiler/mlds_to_java_global.m:
compiler/mlds_to_java_stmt.m:
    Conform to the changes above.

    These modules had several pieces of code that converted an MLDS type
    to a C#/Java type name and a list of array dimensions, but then
    ignored the dimensions. In every case, this was ok, because the MLDS type
    being converted was one whose C#/Java form had no array dimensions,
    but it still looked strange. The replacement code calls the
    type_to_string_for_{csharp,java} functions, which do not have this
    problem.

    In mlds_to_java_stmt.m, replace two switches on the same value,
    with a block of code between them, with a single switch, in which
    each arm calls a predicate whose body used to be the sandwiched-between
    block. Unlike the old structure, the new structure makes it easy to see
    that the many parentheses in the generated code are balanced.

    Add XXXs about some dubious aspects of the code.

compiler/mlds.m:
    Add a table listing, for each Mercury builtin type, the name of the
    C, Java and C# types that we use to represent values of that type.
    The creation of this table was the initial impetus for the
    trip-down-the-rabbit-hole described above :-)

compiler/prog_type.m:
    Put related functors next to each other.
2023-05-04 02:22:33 +10:00
Zoltan Somogyi
49c70a741b Improve formatting. 2023-05-02 02:24:01 +10:00
Zoltan Somogyi
f7442dbb1a Change some "arity"s to "pred_form_arity"s.
compiler/hlds_rtti.m:
compiler/mlds.m:
    As above.

compiler/add_pragma_tabling.m:
compiler/ml_accurate_gc.m:
compiler/ml_code_util.m:
compiler/ml_elim_nested.m:
compiler/ml_util.m:
compiler/mlds_dump.m:
compiler/mlds_to_c_class.m:
compiler/mlds_to_c_data.m:
compiler/mlds_to_c_file.m:
compiler/mlds_to_c_func.m:
compiler/mlds_to_c_name.m:
compiler/mlds_to_c_stmt.m:
compiler/mlds_to_cs_name.m:
compiler/mlds_to_java_name.m:
compiler/proc_label.m:
compiler/rtti_to_mlds.m:
    Conform to the changes above.

    Simplify some code that does output.

    Delete some no-longer-relevant comments.
2022-10-06 09:11:11 +11:00
Zoltan Somogyi
3eb0054658 Add and/or fix comments. 2022-08-04 08:56:38 +10:00
Zoltan Somogyi
aa372b915f Fix indentation. 2021-12-31 00:06:34 +11:00
Zoltan Somogyi
09aff57259 Represent "is this a subtype" using a bespoke type.
compiler/prog_data.m:
compiler/hlds_data.m:
    Use a new type, maybe_subtype, to say whether a du type is a subtype.

compiler/add_foreign_enum.m:
compiler/add_special_pred.m:
compiler/add_type.m:
compiler/check_parse_tree_type_defns.m:
compiler/comp_unit_interface.m:
compiler/decide_type_repn.m:
compiler/du_type_layout.m:
compiler/equiv_type.m:
compiler/equiv_type_hlds.m:
compiler/hlds_out_module.m:
compiler/ml_type_gen.m:
compiler/mlds.m:
compiler/module_qual.qualify_items.m:
compiler/parse_tree_out.m:
compiler/parse_type_defn.m:
compiler/prog_type.m:
compiler/recompilation.usage.m:
compiler/switch_util.m:
compiler/table_gen.m:
compiler/type_ctor_info.m:
compiler/type_util.m:
compiler/unify_proc.m:
compiler/unused_imports.m:
    Conform to the changes above.
2021-07-09 09:25:04 +10:00
Zoltan Somogyi
254cd500bf Add bespoke type for du types' details.
compiler/hlds_data.m:
    As above. The other kinds of types already had bespoke types
    for *their* details.

compiler/add_type.m:
compiler/du_type_layout.m:
    Instead of passing values of the hlds_type_body with an inst
    that said they were du types, pass values of the new types instead,
    which is significantly simpler.

compiler/add_foreign_enum.m:
compiler/add_special_pred.m:
compiler/check_typeclass.m:
compiler/code_info.m:
compiler/dead_proc_elim.m:
compiler/det_report.m:
compiler/direct_arg_in_out.m:
compiler/equiv_type_hlds.m:
compiler/foreign.m:
compiler/hlds_out_module.m:
compiler/inst_check.m:
compiler/intermod.m:
compiler/ml_type_gen.m:
compiler/ml_unify_gen_test.m:
compiler/ml_unify_gen_util.m:
compiler/mlds.m:
compiler/mode_util.m:
compiler/post_term_analysis.m:
compiler/recompilation.usage.m:
compiler/resolve_unify_functor.m:
compiler/simplify_goal_ite.m:
compiler/special_pred.m:
compiler/switch_util.m:
compiler/table_gen.m:
compiler/term_norm.m:
compiler/type_ctor_info.m:
compiler/type_util.m:
compiler/typecheck.m:
compiler/unify_proc.m:
compiler/untupling.m:
compiler/unused_imports.m:
compiler/xml_documentation.m:
    Conform to the changes above.
2021-07-01 08:26:04 +10:00
Zoltan Somogyi
0d7c8a7654 Specify pred or func for all pragmas.
*/*.m:
    As above.

configure.ac:
    Require the installed compiler to support this capability.
2021-06-16 15:23:58 +10:00
Zoltan Somogyi
d273494121 Rename a functor to avoid ambiguity. 2021-05-17 18:52:24 +10:00
Peter Wang
e2b5ba8884 Make subtypes share high-level data representation with base type.
In the high-level data representation, make a subtype term be
represented using the class corresponding to the base type constructor
instead of its own class. This is necessary to be able to downcast
a term from a type to a subtype in Java and C#.

compiler/du_type_layout.m:
    Move get_base_type_ctor predicate to type_util.m.

    Abort in a couple of situations that should not occur.

compiler/type_util.m:
    Add get_base_type_ctor predicate.

compiler/globals.m:
    Add compilation_target_high_level_data predicate.

compiler/lco.m:
    Use compilation_target_high_level_data predicate.

compiler/ml_type_gen.m:
    When using the high-level data representation,
    don't generate a MLDS type definition (class) for a subtype.

compiler/mlds.m:
    When using the high-level data representation,
    replace a Mercury subtype with its base type in an mlds_type.

    Move foreign_type_to_mlds_type.

compiler/ml_unify_gen_util.m:
    To access a field when using the high-level data representation,
    use field names from the base type constructor of a subtype.

compiler/unify_proc.m:
    When using the high-level data representation,
    generate unify/compare procs for subtypes that just call the
    unify/compare proc for the base type constructor.

compiler/options.m:
    Delete references to --high-level and --high-level-data.

---------------

runtime/mercury_type_info.h:
    Document a new field MR_type_ctor_base in MR_TypeCtorInfo_Struct.
    The field is unnecessary and does not exist in the
    MR_TypeCtorInfo_Struct for C.

runtime/mercury_dotnet.cs.in:
    Add type_ctor_base member to MR_TypeCtorInfo_Struct for C#.

java/runtime/TypeCtorInfo_Struct.java:
    Add type_ctor_base member to MR_TypeCtorInfo_Struct for Java.

compiler/rtti.m:
compiler/type_ctor_info.m:
    Add field corresponding to MR_type_ctor_base in type_ctor_details
    for enum, notag and general du types.

compiler/rtti_to_mlds.m:
    Initialize the MR_type_ctor_base field in type_ctor_infos
    for high-level data grades.

compiler/rtti_out.m:
    Don't write out the MR_type_ctor_base field when using
    the low-level data representation.

library/rtti_implementation.m:
    In Java and C# grades (high-level data grades), use the
    MR_type_ctor_base field to get the type_ctor_info of the base type
    ctor when constructing or deconstructing terms of a subtype.
    It is necessary to perform reflection using class and field names
    from the base type constructor since there are no classes
    corresponding to subtypes.

    Clean up some code.

---------------

tests/hard_coded/Mmakefile:
tests/hard_coded/subtype_abstract.m:
tests/hard_coded/subtype_abstract_2.m:
tests/hard_coded/subtype_abstract.exp:
    Add a test case.

tests/hard_coded/subtype_rtti.m:
tests/hard_coded/subtype_rtti.exp2:
    Enable a test that was previously skipped in Java and C# grades.
2021-04-09 17:41:23 +10:00
Zoltan Somogyi
a19a5f0267 Delete the Erlang backend from the compiler.
compiler/elds.m:
compiler/elds_to_erlang.m:
compiler/erl_backend.m:
compiler/erl_call_gen.m:
compiler/erl_code_gen.m:
compiler/erl_code_util.m:
compiler/erl_rtti.m:
compiler/erl_unify_gen.m:
compiler/erlang_rtti.m:
compiler/mercury_compile_erl_back_end.m:
    Delete these modules, which together constitute the Erlang backend.

compiler/notes/compiler_design.html:
    Delete references to the deleted modules.

compiler/parse_tree_out_type_repn.m:
    Update the format we use to represent the sets of foreign_type and
    foreign_enum declarations for a type as part of its item_type_repn_info,
    now that Erlang is no longer a target language.

compiler/parse_type_repn.m:
    Accept both the updated version of the item_type_repn_info and the
    immediately previous version, since the installed compiler will
    initially generate that previous version. However, stop accepting
    an even older version that we stopped generating several months ago.

compiler/parse_pragma_foreign.m:
    When the compiler finds a reference to Erlang as a foreign language,
    add a message about support for Erlang being discontinued to the error
    message.

    Make the code parsing foreign_decls handle the term containing
    the foreign language the same way as the code parsing foreign
    codes, procs, types and enums.

    Add a mechanism to help parse_mutable.m to do the same.

compiler/parse_mutable.m:
    When the compiler finds a reference to Erlang as a foreign language,
    print an error message about support for Erlang being discontinued.

compiler/compute_grade.m:
    When the compiler finds a reference to Erlang as a grade component,
    print an informational message about support for Erlang being discontinued.

compiler/pickle.m:
compiler/make.build.m:
    Delete Erlang foreign procs and types.

compiler/add_foreign_enum.m:
compiler/add_mutable_aux_preds.m:
compiler/add_pred.m:
compiler/add_solver.m:
compiler/add_type.m:
compiler/check_libgrades.m:
compiler/check_parse_tree_type_defns.m:
compiler/code_gen.m:
compiler/compile_target_code.m:
compiler/compute_grade.m:
compiler/const_struct.m:
compiler/convert_parse_tree.m:
compiler/dead_proc_elim.m:
compiler/decide_type_repn.m:
compiler/deps_map.m:
compiler/du_type_layout.m:
compiler/export.m:
compiler/foreign.m:
compiler/globals.m:
compiler/granularity.m:
compiler/handle_options.m:
compiler/hlds_code_util.m:
compiler/hlds_data.m:
compiler/hlds_module.m:
compiler/inlining.m:
compiler/int_emu.m:
compiler/intermod.m:
compiler/item_util.m:
compiler/lambda.m:
compiler/lco.m:
compiler/llds_out_file.m:
compiler/make.dependencies.m:
compiler/make.m:
compiler/make.module_dep_file.m:
compiler/make.module_target.m:
compiler/make.program_target.m:
compiler/make.util.m:
compiler/make_hlds_separate_items.m:
compiler/make_hlds_warn.m:
compiler/mercury_compile_llds_back_end.m:
compiler/mercury_compile_main.m:
compiler/mercury_compile_middle_passes.m:
compiler/mercury_compile_mlds_back_end.m:
compiler/ml_code_util.m:
compiler/ml_foreign_proc_gen.m:
compiler/ml_target_util.m:
compiler/ml_top_gen.m:
compiler/mlds.m:
compiler/mlds_dump.m:
compiler/mlds_to_c_export.m:
compiler/mlds_to_c_file.m:
compiler/mlds_to_cs_data.m:
compiler/mlds_to_cs_export.m:
compiler/mlds_to_cs_file.m:
compiler/mlds_to_cs_type.m:
compiler/mlds_to_java_export.m:
compiler/mlds_to_java_file.m:
compiler/mlds_to_java_type.m:
compiler/module_imports.m:
compiler/parse_pragma_foreign.m:
compiler/parse_tree_out.m:
compiler/polymorphism.m:
compiler/pragma_c_gen.m:
compiler/prog_data.m:
compiler/prog_data_foreign.m:
compiler/prog_foreign.m:
compiler/prog_item.m:
compiler/simplify_goal_scope.m:
compiler/special_pred.m:
compiler/string_encoding.m:
compiler/top_level.m:
compiler/uint_emu.m:
compiler/write_deps_file.m:
    Remove references to Erlang as a backend or as a target language.

tests/invalid/bad_foreign_code.{m,err_exp}:
tests/invalid/bad_foreign_decl.{m,err_exp}:
tests/invalid/bad_foreign_enum.{m,err_exp}:
tests/invalid/bad_foreign_export.{m,err_exp}:
tests/invalid/bad_foreign_export_enum.{m,err_exp}:
tests/invalid/bad_foreign_import_module.{m,err_exp}:
tests/invalid/bad_foreign_proc.{m,err_exp}:
tests/invalid/bad_foreign_type.{m,err_exp}:
    Add a test for Erlang as an invalid foreign language. Expect both the
    new error message for this new error, and the updated list of now-valid
    foreign languages on all errors.
2020-10-29 13:24:49 +11:00
Zoltan Somogyi
49dddf74f0 Improve the style of ml_type_gen.m.
compiler/ml_type_gen.m:
    Stop exporting two of the five functions that returned flag settings,
    since they are called only from inside this module.

    Delete the other three such functions. They each had only one call site
    (one in ml_type_gen.m and two in mlds_to_java_wrap.m), so inline them
    at their respective call sites.

    Inline some local predicates at their only call sites.

    Have predicates that always construct a single class definition
    return that definition, instead of updating a list of class definitions.
    Their new signatures encode this invariant.

    Clarify the code for constructing class fields.

    Delete a long-unused predicate.

compiler/mlds.m:
    Delete fvn_sectag_const, which was generated only in the now-deleted
    long-unused predicate in ml_type_gen.m.

    Expand abbreviations in comments.

compiler/mlds_dump.m:
    Conform to the change to mlds.m.

compiler/mlds_to_java_wrap.m:
    Conform to the change to ml_type_gen.m.
2020-07-13 23:35:12 +10:00
Julien Fischer
df4c5b54a2 Fix a couple of typos.
compiler/mlds.m:
    As above.
2020-07-05 01:41:00 +10:00
Zoltan Somogyi
4743b148c2 Simplify code working with mlds_imports.
compiler/mlds.m:
    Replace the mlds_module_name field in mlds_imports with a plain
    module_name, since the only thing we ever did with that field
    was to convert it to a plain module_name :-(

    Rename the function symbol of the mlds_import type from mercury_import,
    which was confusing, to mlds_import, which shouldn't be.

    Delete an unneeded type.

compiler/ml_top_gen.m:
    Delete a loop whose every iteration did absolutely nothing.

    Conform to the change in mlds.m.

compiler/mlds_to_c_file.m:
compiler/mlds_to_cs_file.m:
compiler/mlds_to_java_file.m:
    Conform to the change in mlds.m.
2020-03-17 11:58:41 +11:00
Zoltan Somogyi
8214456761 Delete the foreign_import_module_info type.
compiler/prog_data_foreign.m:
compiler/prog_item.m:
    Replace it with the fim_spec type, which contains the exact same information.

compiler/comp_unit_interface.m:
compiler/compile_target_code.m:
compiler/make.module_dep_file.m:
compiler/mercury_compile_llds_back_end.m:
compiler/ml_top_gen.m:
compiler/mlds.m:
compiler/mlds_to_c_file.m:
compiler/module_imports.m:
compiler/parse_tree_out.m:
compiler/prog_foreign.m:
compiler/write_deps_file.m:
    Conform to the change above.
2019-09-13 14:28:03 +10:00
Zoltan Somogyi
d113e2f206 Use similar foreign type representations for parse_tree and HLDS.
The parse tree needs to represent foreign_type pragmas without knowing which
foreign language it is for, while the HLDS wants to represent all the
foreign_type pragmas for a given type constructor for all foreign languages.
They used to use separate types to do this. This diff eliminates as much
of this difference as possible.

compiler/prog_data.m:
    Generalize the type (type_details_foreign) we use to represent the
    information content of a foreign type pragma. Document the use of
    this type both in the parse tree (the generic version) and in the HLDS
    (versions specific to each foreign language).

compiler/hlds_data.m:
    Use instances of the newly generalized type_details_foreign type
    instead of a separate type.

compiler/add_type.m:
compiler/dead_proc_elim.m:
compiler/du_type_layout.m:
compiler/foreign.m:
compiler/intermod.m:
compiler/mlds.m:
compiler/mlds_to_java_type.m:
compiler/parse_pragma.m:
compiler/pragma_c_gen.m:
compiler/prog_foreign.m:
2019-09-10 21:53:09 +10:00
Zoltan Somogyi
b66f45e4db Tighten the mlds_type type.
compiler/mlds.m:
    Make two changes to mlds_type.

    The simpler change is the deletion of the maybe(foreign_type_assertions)
    field from the MLDS representations of Mercury types. It was never used,
    because Mercury types that are defined in a foreign language that is
    acceptable for the current MLDS target platform are represented
    as mlds_foreign_type, not as mercury_type.

    The more involved change is to change the representation of builtin types.
    Until now, we had separate function symbols in mlds_type to represent
    ints, uints, floats and chars, but not strings or values of the sized
    types {int,uint}{8,16,32,64}; those had to be represented as Mercury types.
    This is an unnecessary inconsistency. It also had two allowed
    representations for ints, uints, floats and chars, which meant that
    some of the code handling those conceptual types had to be duplicated
    to handle both representations.

    This diff provides mlds_builtin_type_{int(_),float,string,char} function
    symbols to represent every builtin type, and changes mercury_type
    to mercury_nb_type to make clear that it is NOT to be used for builtins
    (the nb is short for "not builtin").

compiler/ml_code_util.m:
compiler/ml_util.m:
    Delete functions that used to construct MLDS representations of builtin
    types. The new representation of those types is so simple that using
    such functions is no less cumbersome than writing down the representations
    directly.

compiler/ml_accurate_gc.m:
compiler/ml_call_gen.m:
compiler/ml_closure_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_foreign_proc_gen.m:
compiler/ml_global_data.m:
compiler/ml_lookup_switch.m:
compiler/ml_proc_gen.m:
compiler/ml_rename_classes.m:
compiler/ml_simplify_switch.m:
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
compiler/ml_tag_switch.m:
compiler/ml_type_gen.m:
compiler/ml_unify_gen_construct.m:
compiler/ml_unify_gen_deconstruct.m:
compiler/ml_unify_gen_util.m:
compiler/mlds_dump.m:
compiler/mlds_to_c_data.m:
compiler/mlds_to_c_export.m:
compiler/mlds_to_c_func.m:
compiler/mlds_to_c_global.m:
compiler/mlds_to_c_stmt.m:
compiler/mlds_to_c_type.m:
compiler/mlds_to_cs_data.m:
compiler/mlds_to_cs_stmt.m:
compiler/mlds_to_cs_type.m:
compiler/mlds_to_java_data.m:
compiler/mlds_to_java_stmt.m:
compiler/mlds_to_java_type.m:
compiler/mlds_to_java_wrap.m:
compiler/rtti_to_mlds.m:
    Conform to the changes above.
2018-09-28 23:07:23 +10:00
Zoltan Somogyi
6a915eef05 Optimize field updates inside packed arg words.
Since June, we have been copying words containing packed-together
sub-word-sized arguments all in one piece if possible, for hlc grades.
This means that given a type such as

:- type t
    --->    f1(int8, bool, int8, int, bool, int8, bool).

whose first three and last three arguments are packed into one word each,
and a predicate such as

    p(T0, T) :-
        T0 = f1(A, B, C, _, E, F, G),
        D = 42,
        T  = f1(A, B, C, D, E, F, G).

we generated code such as

    MR_Integer D_12 = (MR_Integer) 42;
    MR_Unsigned packed_args_0 =
        (MR_Unsigned) ((MR_hl_field(MR_mktag(0), T0_3, (MR_Integer) 0)));
    MR_Unsigned packed_args_1 =
        (MR_Unsigned) ((MR_hl_field(MR_mktag(0), T0_3, (MR_Integer) 2)));

    base = (MR_Word) MR_new_object(MR_Word,
        ((MR_Integer) 3 * sizeof(MR_Word)), NULL, NULL);
    *T_4 = base;
    MR_hl_field(MR_mktag(0), base, 0) = (MR_Box) (packed_args_0);
    MR_hl_field(MR_mktag(0), base, 1) = ((MR_Box) (D_12));
    MR_hl_field(MR_mktag(0), base, 2) = (MR_Box) (packed_args_1);

which does NOT pick up the values A, B, C, E, F and G individually.
However, until now, we could reuse packed-together words only in their
unchanged form.

This diff lifts that limitation, which means that now, we can *also*
optimize code such as

    p(T0, T) :-
        T0 = f1(A, B, _, D, E, _, G),
        C = 42i8,
        F = 43i8,
        T  = f1(A, B, C, D, E, F, G).

by generating code like this:

    base = (MR_Word) MR_new_object(MR_Word,
        (3 * sizeof(MR_Word)), NULL, NULL);
    *T_4 = base;
    MR_hl_field(MR_mktag(0), base, 0) = (MR_Box)
        ((((packed_word_0 & (~((MR_Unsigned) 255U)))) |
        (MR_Unsigned) ((uint8_t) (C_12))));
    MR_hl_field(MR_mktag(0), base, 1) = ((MR_Box) (D_8));
    MR_hl_field(MR_mktag(0), base, 2) = (MR_Box)
        ((((packed_word_1 & (~((MR_Unsigned) 510U)))) |
        (((MR_Unsigned) ((uint8_t) (F_13)) << 1))));

The general scheme when reusing *part* of a word is: first set the bits
not being reused to zero, and then OR in new values of those bits.
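
The clear-then-OR scheme can be sketched in C as follows. This is a minimal
illustration, not the code the compiler generates: replace_bitfield is a
hypothetical helper, and uintptr_t is used as a stand-in for MR_Unsigned.

```c
#include <assert.h>
#include <stdint.h>

/* Replace one bitfield inside a packed word.
   `mask` selects the bitfield in its final position (e.g. 255 for an
   8-bit field at bit 0, 510 for an 8-bit field at bit 1), and `shift`
   is the position of the field's least significant bit. */
static uintptr_t
replace_bitfield(uintptr_t word, uintptr_t mask, unsigned shift,
    uintptr_t value)
{
    /* Step 1: set the bits that are not being reused to zero. */
    uintptr_t cleared = word & ~mask;
    /* Step 2: move the new value into position, keeping it inside
       the field so that it cannot disturb neighbouring fields. */
    uintptr_t filled = (value << shift) & mask;

    assert(mask != 0);
    /* Step 3: OR the new bits into the cleared word. */
    return cleared | filled;
}
```

In terms of the generated code above, replace_bitfield(packed_word_0, 255, 0, C)
corresponds to the update of word 0, and replace_bitfield(packed_word_1, 510, 1, F)
to the update of word 2.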

Make this optimization as general as possible by making it work
not just for

- words in memory cells containing only arguments,

but also for

- words in memory cells containing a remote sectag as well as arguments, and
- words in registers containing a ptag and a local sectag as well as
  arguments.

compiler/ml_gen_info.m:
    Generalize the data structure we use to represent information about
    packed words to make possible approximate as well as exact lookups.
    The key in the old map was "these bitfields with the values of these
    variables in them", while the key in the new map is just "these bitfields",
    with the associated value being a list, each element of which says
    "the word with these values in those bitfields is available in this rval".
    This makes it possible to look for matching words that have some, but not
    all, of the right values in the bitfields.

    Since the packed words may now contain tags as well as arguments,
    rename "packed args" to "packed word".
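
The difference between exact and approximate lookup can be sketched in C.
This is only an illustration under stated assumptions: the known_word struct,
the find_best_match function, the use of integer ids for variables and rvals,
and the "most matching bitfields wins" policy are all hypothetical, and the
real data structure is a Mercury map keyed on the bitfield layout. The sketch
covers the list stored for one such layout.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FIELDS 4    /* assumed upper bound on bitfields per word */

/* One entry in the list for a given bitfield layout: which variable
   fills each bitfield, and which rval holds the whole word. */
typedef struct {
    int      rval_id;
    unsigned field_vars[MAX_FIELDS];
} known_word;

/* Find the known word that shares the most bitfield contents with the
   word we want to build. Returns its index, or -1 if no entry shares
   any bitfield. *num_matches receives the number of shared bitfields;
   a value equal to num_fields means the match is exact. */
static int
find_best_match(const known_word *entries, size_t num_entries,
    const unsigned *wanted, size_t num_fields, size_t *num_matches)
{
    int     best = -1;
    size_t  best_count = 0;

    assert(num_fields <= MAX_FIELDS);
    for (size_t i = 0; i < num_entries; i++) {
        size_t count = 0;
        for (size_t f = 0; f < num_fields; f++) {
            if (entries[i].field_vars[f] == wanted[f]) {
                count++;
            }
        }
        if (count > best_count) {
            best_count = count;
            best = (int) i;
        }
    }
    *num_matches = best_count;
    return best;
}
```

A partial match tells the code generator which existing word to start from;
the bitfields that did not match are the ones it then has to overwrite.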

compiler/ml_unify_gen_deconstruct.m:
    When deconstructing a term containing packed words, add them to the
    packed word map even when one of the bitfields inside the packed word
    contains tag information.

    Move the code that adds a packed word to the map into a separate predicate,
    now that it is needed from more than one place.

compiler/ml_unify_gen_construct.m:
    Change the code that handles packed words to work in terms of filled
    bitfields. Use this not only to implement the optimization described
    at the top, but also to make the handling of bitfields more systematic.
    At least one previous bug was caused by doing sign extension differently
    for the bitfield containing the first packed argument in a word than for
    the later packed arguments in that word; with the new design, such
    inconsistencies should not happen.
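
To show why consistent handling of signed sub-word values matters, here is a
small C sketch. It is a generic illustration, not a reconstruction of the bug
mentioned above: pack_int8 and unpack_int8 are hypothetical helpers, and
uintptr_t stands in for MR_Unsigned. The cast through uint8_t mirrors the
`(MR_Unsigned) ((uint8_t) (C_12))` pattern in the generated code shown earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Store a signed 8-bit value in the bitfield starting at bit `shift`. */
static uintptr_t
pack_int8(uintptr_t word, unsigned shift, int8_t value)
{
    uintptr_t mask = (uintptr_t) 0xFF << shift;
    /* Convert to uint8_t first. Casting a negative int8_t directly to
       uintptr_t would sign-extend it and set every higher bit,
       overwriting the neighbouring bitfields. */
    uintptr_t bits = (uintptr_t) (uint8_t) value << shift;

    return (word & ~mask) | bits;
}

/* Read the signed 8-bit value back from the same bitfield. */
static int8_t
unpack_int8(uintptr_t word, unsigned shift)
{
    unsigned raw = (unsigned) ((word >> shift) & 0xFF);

    /* Sign-extend by hand, so that the result does not depend on how
       the C implementation narrows out-of-range values. */
    return (int8_t) (raw >= 128 ? (int) raw - 256 : (int) raw);
}
```

Using the same pair of helpers for every bitfield in a word, whatever its
position, is the kind of uniformity the new design aims for.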

compiler/ml_unify_gen_util.m:
    Add utility predicates now needed for both construct and deconstruct
    unifications.

compiler/mlds.m:
    Document the new use of lvnc_packed_word (renamed from lvnc_packed_args).

compiler/ml_code_gen.m:
compiler/ml_code_util.m:
compiler/ml_commit_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
compiler/ml_tag_switch.m:
    Conform to the changes above (mostly the packed_word rename).

compiler/mlds_to_c_data.m:
compiler/mlds_to_c_stmt.m:
    Omit unneeded casts from the output. Specifically, don't put (MR_Integer)
    casts in front of integer constants being used either as shift amounts,
    or as the number of words that a new_object MLDS operation should allocate.
    The casts only cluttered the output, making it harder to read, and
    therefore to judge its correctness.
2018-09-10 16:17:17 +10:00
Zoltan Somogyi
86f563a94d Pack subword-sized arguments next to a remote sectag.
compiler/du_type_layout.m:
    If the --allow-packing-remote-sectag option is set, then try to pack
    an initial subsequence of subword-sized arguments next to remote sectags.

    To allow the polymorphism transformation to put the type_infos and/or
    typeclass_infos it adds to a function symbol's argument list at the
    *front* of that argument list, pack arguments next to remote sectags
    only in function symbols that won't have any such extra arguments
    added to them.

    Do not write all new code for the new optimization; instead, generalize
    the code that already does a very similar job for packing args next to
    local sectags.

    Delete the code we used to have that picked the packed representation
    over the base unpacked representation only if it reduced the
    "rounded-to-even" number of words. A case could be made for its usefulness,
    but in the presence of the new optimization the extra code complexity
    it requires is not worth it (in my opinion).

    Extend the code that informs users about possible argument order
    rearrangements that yield better packing to take packing next to sectags
    into account.
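
The "initial subsequence" rule can be sketched in C. This is a simplified
illustration under assumptions, not the algorithm in du_type_layout.m: the
function name is hypothetical, each argument is described only by its width
in bits, and a width of 0 is used here to mark an argument that needs a full
word of its own.

```c
#include <assert.h>
#include <stddef.h>

/* Return how many leading arguments fit in the same word as a remote
   secondary tag that occupies `sectag_bits` bits.
   arg_bits[i] is the width of argument i in bits; 0 (an encoding
   assumed for this sketch) means the argument is not subword-sized.
   Packing stops at the first argument that does not fit, because only
   an initial subsequence of the arguments may be packed. */
static size_t
num_args_packable_with_sectag(const unsigned *arg_bits, size_t num_args,
    unsigned sectag_bits, unsigned word_bits)
{
    unsigned used = sectag_bits;
    size_t   n = 0;

    assert(sectag_bits <= word_bits);
    while (n < num_args
        && arg_bits[n] != 0
        && used + arg_bits[n] <= word_bits)
    {
        used += arg_bits[n];
        n++;
    }
    return n;
}
```

For argument widths {8, 1, 8, 0, 1}, a 2-bit sectag and 64-bit words, the
first three arguments are packed next to the sectag; the trailing 1-bit
argument is not, because it comes after a full-word argument.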

compiler/hlds_data.m:
    Provide a representation for cons_tags that use the new optimization.
    Instead of adding a new cons_tag, we do this by replacing several old
    cons_tags that all represent pointers to memory cells with a single
    cons_tag named remote_args_tag with an argument that selects among
    the old cons_tags being replaced, and adding a new alternative inside
    this new type. The new alternative is remote_args_shared with a
    remote_sectag whose size is rsectag_subword(...).

    Instead of representing the value of the "data" field in classes
    on the Java and C# backends as a strange kind of secondary tag
    that is added to a memory cell by a class constructor instead of
    having to be explicitly added to the front of the argument vector
    by the code of a unification, represent it more directly as a separate
    kind of remote_args_tag. Continuing to treat it as a sectag would have
    been very confusing to readers of the code of ml_unify_gen_*.m in the
    presence of the new optimization.

    Replacing several cons_tags that were usually treated similarly with
    one cons_tag simplifies many switches. Instead of a switch that
    branches to the same switch arm for single_functor_tag, unshared_tag
    and shared_remote_tag, and then switches on these three tags again
    to get e.g. the primary tag of each, the new code of the switch arm
    is executed for just one cons_tag value (remote_args_tag), and switches
    on the various kinds of remote args tags only when it needs to.
    It is also more natural to pass around the argument of remote_args_tag
    than to pass around a variable of type cons_tag that can be bound to only
    single_functor_tag, unshared_tag or shared_remote_tag.

    Add an XXX about possible further steps along these lines, such as
    making a new cons_tag named something like "user_const_tag" represent
    all user-visible constants.

compiler/unify_gen_construct.m:
compiler/unify_gen_deconstruct.m:
compiler/unify_gen_test.m:
compiler/unify_gen_util.m:
compiler/ml_unify_gen_construct.m:
compiler/ml_unify_gen_deconstruct.m:
compiler/ml_unify_gen_test.m:
compiler/ml_unify_gen_util.m:
    Implement X = f(Yi) unifications where f uses the new representation,
    i.e. some of its arguments are stored next to a remote sectag.

    Some of the Yi are stored in a tagword (a word that also contains a tag,
    in this case the remote secondary tag), while some are stored in other
    words in a memory cell. This means that such unifications have similarities
    both to unifications involving arguments being packed next to local
    sectags, and to unifications involving ordinary arguments in memory cells.
    Therefore wherever possible, their implementation uses suitably generalized
    versions of existing code that did those two jobs for two separate kinds of
    cons_tags.
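
    As a rough illustration of the layout described above (not the
    compiler's actual code), the following C sketch packs a remote secondary
    tag and two small arguments into the first word of a cell, and stores
    a full-word argument in the next word. The field widths, the placement
    of the sectag in the low-order bits, and all names are assumptions
    made for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the tagword (illustrative only):
   bits 0-1 hold the remote sectag, bits 2-9 hold an 8-bit argument,
   bit 10 holds a bool argument. */
#define SECTAG_BITS 2u
#define SECTAG_MASK ((1u << SECTAG_BITS) - 1u)
#define ARG_A_SHIFT 2u
#define ARG_A_MASK  0xffu
#define ARG_B_SHIFT 10u
#define ARG_B_MASK  0x1u

/* Build the tagword: the sectag and the packed arguments share one word. */
static uintptr_t make_tagword(unsigned sectag, uint8_t a, unsigned b)
{
    assert(sectag <= SECTAG_MASK && b <= ARG_B_MASK);
    return (uintptr_t) sectag
        | ((uintptr_t) a << ARG_A_SHIFT)
        | ((uintptr_t) b << ARG_B_SHIFT);
}

static unsigned tagword_sectag(uintptr_t w)
{
    return (unsigned) (w & SECTAG_MASK);
}

static uint8_t tagword_arg_a(uintptr_t w)
{
    return (uint8_t) ((w >> ARG_A_SHIFT) & ARG_A_MASK);
}

static unsigned tagword_arg_b(uintptr_t w)
{
    return (unsigned) ((w >> ARG_B_SHIFT) & ARG_B_MASK);
}

/* Fill a two-word cell: word 0 is the tagword, word 1 an ordinary argument. */
static void construct_cell(uintptr_t cell[2], unsigned sectag,
    uint8_t a, unsigned b, uintptr_t full_word_arg)
{
    cell[0] = make_tagword(sectag, a, b);
    cell[1] = full_word_arg;
}
```

    A deconstruction would apply the three accessors to cell[0] and read
    cell[1] directly, which is why it resembles both kinds of older
    unifications.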

    Making such generalizations possible in some cases required shifting the
    boundary between predicates, moving work from a caller to a callee
    or vice versa.

    In unify_gen_deconstruct.m, stop using uni_vals to represent *either* a var
    *or* a word in a memory cell. While this enabled us to factor out some
    common code, the predicate boundaries it led to are unsuitable for the
    generalizations we now need.

    Consistently use unsigned ints to represent both the whole and the parts
    of words containing packed arguments (and maybe sectags), except when
    comparing ptag constants with the result of applying the "tag" unop
    to a word (since that unop returns an int, at least for now).

    In a few cases, avoid the recomputation of some information that we
    already know. The motivation is not efficiency, since the recomputation
    we avoid is usually cheap, but the simplification of the code's correctness
    argument.

    Use more consistent terminology in things such as variable names.

    Note the possibility of further future improvements in several places.

compiler/ml_foreign_proc_gen.m:
    Delete a long unused predicate.

compiler/mlds.m:
    Add an XXX documenting a possible improvement.

compiler/rtti.m:
    Update the compiler's internal representation of RTTI data structures
    to make them able to describe secondary tags that are smaller than
    a full word.

compiler/rtti_out.m:
    Conform to the changes above, and delete a long-unused predicate.

compiler/type_ctor_info.m:
    Use the RTTI's du_hl_rep to represent cons_tags that distinguish
    between function symbols using a field in a class.

compiler/ml_type_gen.m:
    Provide a specialized form of a function for code in ml_unify_gen_*.m.
    Conform to the changes above.

compiler/add_special_pred.m:
compiler/bytecode_gen.m:
compiler/export.m:
compiler/hlds_code_util.m:
compiler/lco.m:
compiler/ml_closure_gen.m:
compiler/ml_switch_gen.m:
compiler/ml_tag_switch.m:
compiler/rtti_to_mlds.m:
compiler/switch_util.m:
compiler/tag_switch.m:
    Conform to the changes above.

runtime/mercury_type_info.h:
    Update the runtime's representation of RTTI data structures to make them
    able to describe remote secondary tags that are smaller than a full word.

runtime/mercury_deconstruct.[ch]:
runtime/mercury_deconstruct.h:
runtime/mercury_deconstruct_macros.h:
runtime/mercury_ml_expand_body.h:
runtime/mercury_ml_arg_body.h:
runtime/mercury_ml_deconstruct_body.h:
runtime/mercury_ml_functor_body.h:
    These modules collectively implement the predicates in deconstruct.m
    in the library, and provide access to its functionality to other C code,
    e.g. in the debugger. Update these to be able to handle terms with the
    new data representation optimization.

    This update requires a significant change in the distribution of work
    between these files for the predicates deconstruct.deconstruct and
    deconstruct.limited_deconstruct. We used to have mercury_ml_expand_body.h
    fill in the fields of their expand_info structures (whose types are
    defined in mercury_deconstruct.h) with pointers to three vectors:
    (a) a vector of arg_locns with one element per argument, with a NULL
    pointer being equivalent to a vector with a given element in every slot;
    (b) a vector of type_infos with one element per argument, constructed
    dynamically (and later freed) if necessary; and (c) a vector of argument
    words. Once upon a time, before double-word and sub-word arguments,
    vector (c) also had one word per argument, but that hasn't been true
    for a while; we added vector (a) to help the consumers of the expand_info
    decode the difference. The consumers of this info always used these
    vectors to build up a Mercury term containing a list of univs,
    with one univ for each argument.

    This structure could be stretched to handle function symbols that store
    *all* their arguments in a tagword next to a local sectag, but I found
    that stretching it to cover function symbols that have *some* of their
    arguments packed next to a remote sectag and *some other* of their
    arguments in a memory cell as usual would have required a well-nigh
    incomprehensibly complex, and therefore almost undebuggable, interface
    between mercury_ml_expand_body.h and the other files above. This diff
    therefore changes the interface to have mercury_ml_expand_body.h
    build the list of univs directly. This makes its code relatively simple
    and self-contained, and it should be somewhat faster than the old code
    as well, since it never needs to allocate, fill in and then free
    vectors of type_infos (each such typeinfo now gets put into a univ
    as soon as it is constructed). The downside is that if we ever wanted
    to get all the arguments at once for a purpose other than constructing
    a list of univs from them, it would nevertheless require constructing
    that list of univs anyway as an intermediate data structure. I don't see
    this downside as significant, because (a) I don't think such a use case
    is very likely, and (b) even if one arises, debuggable but a bit slow
    is probably preferable to faster but very hard to debug.

    Reduce the level of indentation of some of these files to make the code
    easier to edit. Do this by

    - not adding an indent level from switch statements to their cases; and
    - not adding an indent level when a case in a switch has a local block.

    Move the break or return ending a case inside that case's block,
    if it has one.
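
    One possible reading of these layout conventions is sketched below in C.
    The function and its cases are invented for illustration; only the
    indentation style is the point, and the runtime files may differ
    in details such as brace placement.

```c
#include <assert.h>

/* Cases sit at the same indent level as the switch itself, a case's
   local block adds no further indent level to its statements, and the
   break ending such a case is inside the block. */
static int words_for_kind(int kind)
{
    int words;

    assert(kind >= 0);
    switch (kind) {

    case 0:
        words = 0;
        break;

    case 1: {
        int base = 1;
        words = base;
        break;
    }

    default: {
        int base = 2;
        words = base + kind;
        break;
    }

    }

    return words;
}
```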

runtime/mercury_deep_copy_body.h:
runtime/mercury_table_type_body.h:
    Update these to enable the copying or tabling of terms whose
    representations use the new optimization.

    Use the techniques listed above to reduce the level of indentation,
    to make the code easier to edit.

runtime/mercury_tabling.c:
runtime/mercury_term_size.c:
    Conform to the changes above.

runtime/mercury_unify_compare_body.h:
    Make this code compile after the changes above. It does not need to work
    correctly, since we only ever used this code to compare the speed
    of unify-by-rtti with the speed of unify-by-compiler-generated-code,
    and in real life, we always use the latter. (It hasn't been updated
    to work right with previous arg packing changes either.)

library/construct.m:
    Update to enable the code to construct terms whose representations
    use the new optimization.

    Add some sanity checks.

library/private_builtin.m:
runtime/mercury_dotnet.cs.in:
java/runtime/Sectag_Locn.java:
    Update the list of possible sectag kinds.

library/store.m:
    Conform to the changes above.

trace/mercury_trace_vars.c:
    Conform to the changes above.

tests/hard_coded/deconstruct_arg.{m,exp,exp2}:
    Extend this test to test the deconstruction of terms whose
    representations use the new optimization.

    Modify some of the existing terms being tested to make them more diverse,
    in order to make the output easier to navigate.

tests/hard_coded/construct_packed.{m,exp}:
    A new test case to test the construction of terms whose
    representations use the new optimization.

tests/debugger/browse_packed.{m,exp}:
    A new test case to test access to the fields of terms whose
    representations use the new optimization.

tests/tabling/test_packed.{m,exp}:
    A new test case to test the tabling of terms whose
    representations use the new optimization.

tests/debugger/Mmakefile:
tests/hard_coded/Mmakefile:
tests/tabling/Mmakefile:
    Enable the new test cases.
2018-08-30 05:14:38 +10:00
Zoltan Somogyi
18bafbd70d Split up mlds_to_cs.m.
compiler/mlds_to_cs.m:
    Delete this module. Move its contents to the following ten modules.

compiler/mlds_to_cs_class.m:
    Code to output class definitions.

compiler/mlds_to_cs_data.m:
    Code to output lvals, rvals and initializers.

compiler/mlds_to_cs_export.m:
    Code to output entities (e.g. enums) exported to C#.

compiler/mlds_to_cs_file.m:
    The top level code, generating whole C# files.

compiler/mlds_to_cs_func.m:
    Code to output function definitions.

compiler/mlds_to_cs_global.m:
    Code to output the definitions of global variables.

compiler/mlds_to_cs_name.m:
    Code to output various kinds of names.

compiler/mlds_to_cs_stmt.m:
    Code to output statements.

compiler/mlds_to_cs_type.m:
    Code to output types.

compiler/mlds_to_cs_util.m:
    Utilities used by the other mlds_to_cs_*.m modules.

compiler/ml_backend.m:
    Delete the old module, add the new modules.

compiler/Mercury.options:
    Require the new modules to have the declarations and definitions
    of their predicates in a consistent order.

compiler/mercury_compile_mlds_back_end.m:
    Import mlds_to_cs_file.m instead of mlds_to_cs.m.

compiler/handle_options.m:
compiler/ml_target_util.m:
compiler/mlds.m:
    Update some references to the deleted file.
2018-06-30 01:00:51 +02:00
Zoltan Somogyi
4404d7d884 Split up mlds_to_java.m.
compiler/mlds_to_java.m:
    Delete this module. Move its contents to the following eleven modules.

compiler/mlds_to_java_class.m:
    Code to output class definitions.

compiler/mlds_to_java_data.m:
    Code to output lvals, rvals and initializers.

compiler/mlds_to_java_export.m:
    Code to output entities (e.g. enums) exported to Java.

compiler/mlds_to_java_file.m:
    The top level code, generating whole Java files.

compiler/mlds_to_java_func.m:
    Code to output function definitions.

compiler/mlds_to_java_global.m:
    Code to output the definitions of global variables.

compiler/mlds_to_java_name.m:
    Code to output various kinds of names.

compiler/mlds_to_java_stmt.m:
    Code to output statements.

compiler/mlds_to_java_type.m:
    Code to output types.

compiler/mlds_to_java_util.m:
    Utilities used by the other mlds_to_java_*.m modules.

compiler/mlds_to_java_wrap.m:
    Code to create wrapper classes, to help implement function pointers.

compiler/ml_backend.m:
    Delete the old module, add the new modules.

compiler/Mercury.options:
    Require the new modules to have the declarations and definitions
    of their predicates in a consistent order.

compiler/mercury_compile_mlds_back_end.m:
    Import mlds_to_java_file.m instead of mlds_to_java.m.

compiler/mlds.m:
compiler/mlds_to_cs.m:
    Update some references to the deleted file.

compiler/mlds_to_c_class.m:
    Delete some stray ZZZs.
2018-06-29 18:13:12 +02:00
Zoltan Somogyi
ec6a40ed85 Put related args of ml_field next to each other.
compiler/mlds.m:
    Put the *type* of the pointer next to the *value* of the pointer.

compiler/ml_accurate_gc.m:
compiler/ml_call_gen.m:
compiler/ml_closure_gen.m:
compiler/ml_code_util.m:
compiler/ml_elim_nested.m:
compiler/ml_optimize.m:
compiler/ml_rename_classes.m:
compiler/ml_string_switch.m:
compiler/ml_type_gen.m:
compiler/ml_unify_gen.m:
compiler/ml_unused_assign.m:
compiler/ml_util.m:
compiler/mlds_dump.m:
compiler/mlds_to_c_data.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/mlds_to_target_util.m:
compiler/rtti_to_mlds.m:
    Conform to the change above.
2018-06-04 23:28:19 +02:00
Zoltan Somogyi
bbe0f28f3b Copy packed arguments all at once.
Copy words containing packed-together sub-word-sized arguments all
in one piece if possible, for hlc grades.

Given a type such as

:- type t
    --->    f1(int8, bool, int8, int, bool, int8, bool).

whose first three and last three arguments are packed into one word each,
and a predicate such as

p(T0, T) :-
    T0 = f1(A, B, C, _, E, F, G),
    D = 42,
    T  = f1(A, B, C, D, E, F, G).

we used to generate code that picked up each of the six named arguments
from T0, and used them to construct T. With this diff, we now translate
the above to

    MR_Integer D_12 = (MR_Integer) 42;
    MR_Unsigned packed_args_0 =
        (MR_Unsigned) ((MR_hl_field(MR_mktag(0), T0_3, (MR_Integer) 0)));
    MR_Unsigned packed_args_1 =
        (MR_Unsigned) ((MR_hl_field(MR_mktag(0), T0_3, (MR_Integer) 2)));

    base = (MR_Word) MR_new_object(MR_Word,
        ((MR_Integer) 3 * sizeof(MR_Word)), NULL, NULL);
    *T_4 = base;
    MR_hl_field(MR_mktag(0), base, 0) = (MR_Box) (packed_args_0);
    MR_hl_field(MR_mktag(0), base, 1) = ((MR_Box) (D_12));
    MR_hl_field(MR_mktag(0), base, 2) = (MR_Box) (packed_args_1);

compiler/ml_unify_gen.m:
    Implement the two main parts of this optimization.

    Part one is the change to deconstruction unifications. When we generate
    assignments from all the fields packed together into a word to their
    corresponding argument variables (such as A/B/C or E/F/G above),
    create a fresh variable (such as packed_args_0 above), assign to it
    the value of the whole word, and record in a new data structure (the
    packed_args_map) that these argument variables, in these positions
    within the word, are now available in the newly created variable.
    (We still define the argument variables as well, since they may be needed;
    deleting them if they are *not* needed is the job of ml_unused_assign.m.)

    Part two is the change to construction unifications. When we generate code
    to OR together the shifted and/or masked values of two or more variables
    to fill in one word in a new heap cell, we search the packed_args_map
    to see whether those variables, in the positions we need, are available
    in one of the variables created in part one. If yes, we discard
    the whole OR-ing together operation and we use that variable instead.

    Since part one can now create local variable definitions, return these
    upwards as needed.
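
    The reason the shortcut in part two is safe can be checked with a small
    C sketch: extracting every field of a packed word and OR-ing the fields
    back together at the same positions reproduces the original word, so the
    word can be copied as is. The field layout below (8 bits, 1 bit, 8 bits)
    is an assumption chosen to resemble the first three arguments of f1;
    it is not necessarily the layout the compiler picks, and the function
    names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed positions of three packed arguments within one word. */
#define A_SHIFT 9u
#define A_MASK  0xffu
#define B_SHIFT 8u
#define B_MASK  0x1u
#define C_SHIFT 0u
#define C_MASK  0xffu
#define USED_BITS 17u

/* The old scheme: unpack each argument, then shift and OR them back. */
static uintptr_t repack_fieldwise(uintptr_t word)
{
    uintptr_t a;
    uintptr_t b;
    uintptr_t c;

    /* The equivalence relies on the unused bits of the word being zero. */
    assert((word >> USED_BITS) == 0);
    a = (word >> A_SHIFT) & A_MASK;
    b = (word >> B_SHIFT) & B_MASK;
    c = (word >> C_SHIFT) & C_MASK;
    return (a << A_SHIFT) | (b << B_SHIFT) | (c << C_SHIFT);
}

/* The new scheme: reuse the whole word saved by the deconstruction. */
static uintptr_t repack_whole_word(uintptr_t packed_args_0)
{
    return packed_args_0;
}
```

    The two functions agree only when the same arguments occupy the same
    positions in both words, which is what the lookup in the
    packed_args_map has to establish.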

compiler/ml_gen_info.m:
    Add two fields to the ml_gen_info structure (actually, to one of its
    substructures). One is the packed_args_map described above, the other
    is a counter we use to give a unique name to all the fresh variables.

    When creating ml_gen_infos, put the code defining each field of a
    substructure next to the creation of that substructure.

compiler/mlds.m:
    Add a kind of compiler-generated variable holding packed argument words.
    It is used in part one above.

compiler/ml_code_gen.m:
compiler/ml_code_util.m:
compiler/ml_commit_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_string_switch.m:
compiler/ml_switch_gen.m:
compiler/ml_tag_switch.m:
    Save, reset and restore the packed_args_map as necessary to ensure that
    a construction unification sees an entry in that map only if the
    deconstruction unification that created that entry *had* to be executed
    before execution reaches the construction unification.

    This means that when we process a branched control structure, we have to
    make sure that (a) entries created by one branch are not seen when
    we generate code for the other branches, and (b) that code *after* the
    branched control structure sees only the entries created *before* the
    branched control structure, since the code that follows cannot use an entry
    that was created by a branch that may or may NOT have been executed
    on the way there.

    We also reset the packed_args_map to empty when generating code
    that will end up inside a nested function, for two reasons. First,
    I am not sure whether the code in ml_elim_nested.m that flattens out
    nested functions is general enough to handle the new kind of compiler
    generated variable correctly. And second, even if it is, the additional
    memory traffic for putting those variables into environments, and later
    pulling them out again, would definitely reduce and maybe completely
    eliminate the speedup from optimizing constructions.

compiler/ml_closure_gen.m:
    Conform to the change in ml_unify_gen.m.

compiler/ml_proc_gen.m:
    Invoke ml_unused_assign.m in both branches of an if-then-else.
    Previously, it was invoked in only the rarely executed branch,
    which is what hid its bugs.

    Fix one bug: for model_semi procedures, include the succeeded variable
    in the set of variables whose values are needed after the generated
    function body.

    Work around another bug: ml_unused_assign.m cannot yet handle
    nested functions properly, so throw away its output in their presence.

compiler/ml_unused_assign.m:
    As part of the same workaround, if a block contains nested functions,
    tell ml_proc_gen.m to use the original code.

    Fix several other bugs.

    Don't delete variables from the seen_set when the backwards traversal
    finds an assignment to them, because the variable's absence from
    the seen_set would lead to the declaration of the variable being deleted.

    Delete a sanity check that made sense only in the presence of such deletions.

    Never delete assignments to compiler-generated variables; we generate
    such assignments only when their results *will* be needed.

    When exiting the traversal of a block, *do* delete the variables
    declared locally in that block from the seen_set; being undeclared there,
    they cannot possibly be seen before that block. Leaving them in
    does not compromise correctness, but does reduce performance
    by making operations on the seen_set slower than necessary.

    If deleting unused assignments makes the else part of an if-then-else
    empty, then delete the whole else part.

compiler/mlds_to_c_stmt.m:
    Generate a valid C statement even for an MLDS comment. When a buggy
    version of ml_unused_assign.m (incorrectly) deleted assignments to
    succeeded, it sometimes left an else part containing only a comment,
    which led gcc to report syntax errors.
2018-06-02 18:56:40 +02:00
Zoltan Somogyi
5bb7a1b727 Delete a remnant of reserved objects.
compiler/mlds.m:
    Delete an old field_var_name form that we haven't used for a while,
    and won't use ever again.
2018-06-02 15:23:01 +02:00
Zoltan Somogyi
2ea99dc61a Fix obsolete comments. 2018-05-22 15:18:02 +02:00
Zoltan Somogyi
94740ed865 Delete func_local.
compiler/mlds.m:
    We haven't supported the hl*_nest grades for more than half a year,
    so delete the func_local function symbol (representing the accessibility
    of nested functions) in the function_access type.

compiler/ml_code_util.m:
    Create nested functions as func_private, not func_local. (ml_elim_nested.m
    will ignore the function's accessibility anyway when it lifts the nested
    function out of its block.)

compiler/mlds_to_c_stmt.m:
    Throw an exception if the MLDS block being output has a nested function
    left in it.

compiler/mlds_to_c_func.m:
compiler/mlds_to_c_util.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
    Conform to the change above.
2018-05-15 11:49:50 +02:00
Zoltan Somogyi
b9afc8b78e Delete the mlds_unary_op type.
compiler/mlds.m:
    We used to have a function symbol ml_unop in the mlds_rval type
    that applied one of four kinds of operations to an argument mlds_rval:
    boxing, unboxing, casting or a standard unary operation, with a value
    of type mlds_unary_op selecting between the four. Replace this system
    with four separate function symbols in the mlds_rval type directly,
    and delete the mlds_unary_op type.

    The new arrangement requires fewer memory cells to be allocated,
    and less indirection; it also leads to shorter and somewhat
    more readable code.

compiler/ml_optimize.m:
    Conform to the change above.

    Recognize that a cast has negligible cost.

compiler/ml_code_util.m:
    Conform to the change above.

    Keep private a predicate that is not used by any other module,
    after merging it with another previously-exported predicate
    that only *it* uses.

    Delete some other predicates that are not used anywhere.

compiler/ml_accurate_gc.m:
compiler/ml_call_gen.m:
compiler/ml_closure_gen.m:
compiler/ml_code_gen.m:
compiler/ml_disj_gen.m:
compiler/ml_elim_nested.m:
compiler/ml_foreign_proc_gen.m:
compiler/ml_global_data.m:
compiler/ml_lookup_switch.m:
compiler/ml_rename_classes.m:
compiler/ml_string_switch.m:
compiler/ml_tag_switch.m:
compiler/ml_unify_gen.m:
compiler/ml_unused_assign.m:
compiler/ml_util.m:
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/mlds_to_target_util.m:
compiler/rtti_to_mlds.m:
    Conform to the change above.
2018-05-13 12:23:38 +02:00
Zoltan Somogyi
fcefbb948d Delete assignments to dead variables in the MLDS.
At the moment, we tend not to generate such assignments, with the exception
of assignments to the MLDS versions of HLDS variables of dummy types.
The reason I am nevertheless adding this optimization is that I intend
to soon add code to ml_unify_gen.m that *will* generate assignments
to dead variables.

The idea is to optimize field updates involving packed arguments.
Given a type such as

:- type t
    --->    f(
                f1      :: bool,
                f2      :: bool,
                f3      :: enum1,
                f4      :: int
            ).

we currently implement a field update such as "T = T0 ^ f4 := 42",
whose HLDS representation is the two unifications

    T0 = f(T0f1, T0f2, T0f3, _),
    T = f(T0f1, T0f2, T0f3, 42)

using code that looks like this:

    T0f1 = (T0[0] >> ...) & ...
    T0f2 = (T0[0] >> ...) & ...
    T0f3 = (T0[0] >> ...) & ...
    T = allocate memory for new memory cell, put on primary tag
    T[0] = (T0f1 << ...) | (T0f2 << ...) | (T0f3 << ...)
    T[1] = 42

I want to implement it using code that looks like this:

    T0w0 = T0[0]
    T = allocate memory for new memory cell, put on primary tag
    T[0] = T0w0
    T[1] = 42

where T0w0 contains the entire first word of the memory cell of T0.
This code avoids a bunch of shifts, ORs and ANDs.

I propose to translate the T0 = f(T0f1, T0f2, T0f3, _) unification into

    T0w0 = T0[0]
    T0f1 = (T0[0] >> ...) & ...
    T0f2 = (T0[0] >> ...) & ...
    T0f3 = (T0[0] >> ...) & ...

while recording in the ml_gen_info/code_info that this *specific* packing of
T0f1, T0f2 and T0f3 is available in T0w0. When translating the following
unification, the code generator will see this, and this will allow it to
generate

    T[0] = T0w0

instead of

    T[0] = (T0f1 << ...) | (T0f2 << ...) | (T0f3 << ...)

However, by this time the assignments to T0f1, T0f2 and T0f3 have already
been generated. Whether or not they are dead assignments depends on whether
other code needs the values of those fields of T0. Deciding this
requires knowledge that the code generator can't have when translating
the deconstruction of T0. Hence the need for a new MLDS-to-MLDS optimization.

compiler/ml_unused_assign.m:
    A new compiler module implementing the new optimization.
    It is not part of ml_optimize.m because ml_optimize.m traverses
    the MLDS forwards, while this optimization requires a backwards traversal:
    you cannot know whether an assignment is dead unless you know that the
    following code does not need the value of the variable it assigns to.
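
    The need for a backwards traversal can be seen in a toy C sketch of
    dead assignment deletion over straight-line code. This is the textbook
    formulation under simplifying assumptions (at most 32 variables, each
    statement a plain assignment); the struct and function names are
    hypothetical and it is not the algorithm in ml_unused_assign.m.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A toy statement "target = f(uses...)". Variables are numbered 0..31
   so that a set of variables fits in one bitmask. */
struct assign {
    unsigned target;      /* variable written */
    unsigned uses_mask;   /* bitmask of variables read */
    bool     deleted;     /* set by the pass when the assignment is dead */
};

/* Walk the statements from last to first. `needed` starts as the set of
   variables needed after the code. An assignment whose target is not in
   the set is dead; a live assignment makes its operands needed. */
static void delete_dead_assigns(struct assign *stmts, size_t n,
    unsigned needed)
{
    for (size_t i = n; i-- > 0; ) {
        unsigned target_bit;

        assert(stmts[i].target < 32);
        target_bit = 1u << stmts[i].target;
        if ((needed & target_bit) == 0) {
            stmts[i].deleted = true;
        } else {
            stmts[i].deleted = false;
            needed &= ~target_bit;
            needed |= stmts[i].uses_mask;
        }
    }
}
```

    Applied to the T0w0/T0f1/T0f2/T0f3 example with only T needed
    afterwards, such a pass keeps the assignment to T0w0 and marks the
    three field extractions as dead.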

compiler/ml_backend.m:
compiler/notes/compiler_design.html:
    Include the new module.

compiler/mlds.m:
    The new optimization needs extra information about loops.
    When it enters into the loop body, it knows which variables
    are needed *after* the loop, but it does not know which variables
    the loop body first reads and then writes. Without this knowledge,
    it would optimize away assignments to loop control variables,
    such as the increment of i in the loop

    i = 0;
    while (...) {
        ...; i = i+1; ...
    }

    Traditionally, compilers have solved this problem by doing fixpoint
    iteration, adding to the live set at each program point until
    no more additions are possible. We can do better, because we generate
    loops in the MLDS in only two kinds of cases:

    - loops implementing tail recursion, in which case the only extra
      variables that we need to preserve assignments to in the loop body
      are the input arguments of the procedure, and
    - loops created by the compiler itself to loop over a set of alternatives,
      for which the only extra variables that we need to preserve assignments
      to in the loop body are the variables the compiler uses to control
      the loop.

    To make it possible for ml_unused_assign.m to do its job without
    a fixpoint iteration, include in the MLDS representation of every
    while loop a list of these variables.
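
    A minimal C sketch of how such a list can replace fixpoint iteration:
    the backward pass over the loop body is seeded with the variables
    needed after the loop plus the variables recorded on the loop. All names
    are hypothetical, variables are limited to 32, and the loop condition
    is ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct assign {
    unsigned target;      /* variable written */
    unsigned uses_mask;   /* bitmask of variables read */
    bool     deleted;     /* set when the assignment is found dead */
};

/* One backward pass over a loop body, with no fixpoint iteration.
   `loop_vars` plays the role of the list stored in the while loop. */
static void scan_loop_body(struct assign *body, size_t n,
    unsigned needed_after_loop, unsigned loop_vars)
{
    unsigned needed = needed_after_loop | loop_vars;

    for (size_t i = n; i-- > 0; ) {
        unsigned bit;

        assert(body[i].target < 32);
        bit = 1u << body[i].target;
        body[i].deleted = (needed & bit) == 0;
        if (!body[i].deleted) {
            needed = (needed & ~bit) | body[i].uses_mask;
        }
    }
}
```

    With a body consisting of "x = g(i); i = i+1" and nothing needed after
    the loop, passing i in loop_vars keeps the increment, while passing
    an empty set would delete it.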

    Add a type to represent the identity of an MLDS local var,
    for use by some of the modules below. They used to store this info
    in the form of mlds_lvals, but that is not specific enough
    to be used to fill in the new field in while loops.

compiler/ml_proc_gen.m:
    Compute the information needed by the new pass, and invoke it
    if the relevant option is set.

compiler/options.m:
    Add this option. It is for developers only, so it is undocumented.

compiler/ml_util.m:
    Add a utility function needed in several places.

compiler/ml_accurate_gc.m:
compiler/ml_disj_gen.m:
compiler/ml_elim_nested.m:
compiler/ml_lookup_switch.m:
compiler/ml_optimize.m:
compiler/ml_rename_classes.m:
compiler/ml_string_switch.m:
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/mlds_to_target_util.m:
    Conform to the changes in mlds.m.
2018-05-09 23:56:28 +02:00
Zoltan Somogyi
fc903a0911 Eliminate the double storage of types in the MLDS.
compiler/mlds.m:
    When we record a Mercury type in the MLDS, we used to record with it
    not just its type category (which some aux predicates need), but also
    the name by which it is known to the target language compiler, if the
    type is defined in that foreign language. Unfortunately, the data structure
    we used to represent the name of the foreign type (and any assertions
    on it) also stored a duplicate copy of the Mercury type in the usual
    case where the Mercury type was *not* defined in the foreign language.
    Having two copies of the same information was dangerous, due to the
    possibility of inconsistency between them. It was also unnecessary work
    for the compiler passes that had to create the duplicate copies.

    Eliminate these problems by always storing *one* copy of the Mercury type.

    Store the Mercury and foreign type information next to each other.

compiler/foreign.m:
    Make the above possible by deleting the old exported_type type,
    which contained the duplicate copy of the Mercury type in the usual case
    of a type that is not defined by foreign code, and replacing it
    with a type that contains information about just a foreign type.

    In the argument lists of the predicates and functions of this module,
    replace arguments that used to be of type exported_type with a pair
    of the Mercury type and a maybe of the new type, which is yes(...)
    iff the Mercury type *is* defined in foreign code.

    Give some predicates and functions more meaningful names.

    Make specialized versions of these functions available (specialized
    e.g. to a target language) where these would be useful.

    Delete the auxiliary predicates that aren't needed with the
    new data structure design.

compiler/export.m:
compiler/ml_accurate_gc.m:
compiler/ml_code_util.m:
compiler/ml_foreign_proc_gen.m:
compiler/ml_global_data.m:
compiler/ml_simplify_switch.m:
compiler/ml_switch_gen.m:
compiler/ml_unify_gen.m:
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/rtti_to_mlds.m:
    Conform to the changes above.
2018-05-08 00:17:34 +02:00
Zoltan Somogyi
24b98fdafe Pack sub-word-sized ints and dummies in terms.
Previously, the only situation in which we could pack two or more arguments
of a term into a single word was when all those arguments are enums. This diff
changes that, so that the arguments can also be sub-word-sized integers
(signed or unsigned), or values of dummy types (which occupy zero bits).

This diff also records, for each argument of a function symbol, not just
whether, and if yes, how it is packed into a word, but also at *what offset*
that word is in the term's heap cell. It is more economical to compute this
once, when the representation of the type is being decided, than to compute
it over and over again when terms with that function symbol are being
constructed or deconstructed. However, for a transition period, we compute
these offsets at *both* times, to check the consistency of the new algorithm
for computing offsets that is run at "decide representation time" with
the old algorithms run at "generate code for a unification time".

compiler/du_type_layout.m:
    Make the changes described above: pack sub-word-sized integers and
    dummy values into argument words, if possible, and if the relevant
    new option allows it. These options are temporary. If we find no problems
    with the new packing algorithm in a few weeks, we should be able to
    delete them.

    Allow 64 bit ints and uints to be stored unboxed in two words
    on 32 bit platforms, if the relevant new option allows it. Support
    for this is not yet complete, but it makes sense to implement the
    RTTI changes for both this change and the one described in the above
    paragraph together.

    For each packed argument, record not just its width, its shift and
    the mask, but also the number of bits the argument takes. Previously,
    we computed this on demand from the mask, but there is no real need
    for that when simply storing this info is so cheap.
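
    In C terms, the per-argument record might look like the sketch below.
    The struct and its field names are hypothetical stand-ins for the
    Mercury data structure; the point is only that the bit count is stored
    alongside the shift and mask rather than recomputed from the mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-argument packing record. */
struct packed_arg {
    unsigned  shift;     /* position of the argument's lowest bit */
    unsigned  num_bits;  /* stored directly, not derived from the mask */
    uintptr_t mask;      /* num_bits low-order one bits */
};

static struct packed_arg make_packed_arg(unsigned shift, unsigned num_bits)
{
    struct packed_arg arg;

    assert(num_bits > 0 && num_bits < 8 * sizeof(uintptr_t));
    arg.shift = shift;
    arg.num_bits = num_bits;
    arg.mask = ((uintptr_t) 1 << num_bits) - 1;
    return arg;
}

/* What computing the bit count on demand from the mask would involve. */
static unsigned num_bits_from_mask(uintptr_t mask)
{
    unsigned n = 0;

    while (mask != 0) {
        n++;
        mask >>= 1;
    }
    return n;
}

/* Extract the argument's value from the word it is packed into. */
static uintptr_t extract_arg(uintptr_t word, struct packed_arg arg)
{
    return (word >> arg.shift) & arg.mask;
}
```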

    For all arguments, packed or not, record its offset, relative to both
    the start of the arguments, and the start of the memory cell. (The two
    are different if the arguments are preceded by either a remote secondary
    tag, the typeinfos and/or typeclass_infos describing some existentially
    typed arguments, or both.) The reason for this is given at the top.
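
    The relationship between the two offsets can be written down as a
    one-line computation. The C sketch below uses a hypothetical struct
    to describe what precedes the arguments in the cell, and assumes that
    a remote secondary tag, when present, occupies a whole word of its own.

```c
#include <assert.h>

/* Hypothetical description of the words that precede the arguments. */
struct cell_prefix {
    unsigned remote_sectag_words;  /* 0 or 1 */
    unsigned extra_info_words;     /* typeinfos and typeclass_infos */
};

/* Convert an offset counted from the first argument word into an
   offset counted from the start of the memory cell. */
static unsigned cell_offset(struct cell_prefix prefix,
    unsigned arg_only_offset)
{
    assert(prefix.remote_sectag_words <= 1);
    return prefix.remote_sectag_words + prefix.extra_info_words
        + arg_only_offset;
}
```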

    Centralize the decision of the parameters of packing in one predicate.

    If the option --inform-suboptimal-packing is given, print an informational
    message whenever the code deciding type representations finds that
    reordering the arguments of a function symbol would allow it to pack
    the arguments of that function symbol into less space.
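
    A hedged sketch of the kind of comparison behind such a message:
    count the words needed for the declared argument order, then count
    the words needed if the sub-word arguments were placed next to each
    other. The greedy packing rule, the 64-bit word size, and the function
    names are assumptions for this example, not a description of
    du_type_layout.m.

```c
#include <assert.h>
#include <stddef.h>

#define WORD_BITS 64u

/* Words needed when arguments are laid out in the given order. Each
   entry is an argument's width in bits; WORD_BITS means a full word and
   0 means a dummy. Adjacent sub-word arguments share a word while
   they fit. */
static unsigned words_needed(const unsigned *arg_bits, size_t n)
{
    unsigned words = 0;
    unsigned used = WORD_BITS;   /* forces a new word for the first arg */

    for (size_t i = 0; i < n; i++) {
        assert(arg_bits[i] <= WORD_BITS);
        if (arg_bits[i] == 0) {
            continue;            /* dummy arguments occupy no bits */
        }
        if (used + arg_bits[i] > WORD_BITS) {
            words++;
            used = 0;
        }
        used += arg_bits[i];
    }
    return words;
}

/* Words needed if all sub-word arguments were moved next to each other. */
static unsigned words_needed_grouped(const unsigned *arg_bits, size_t n)
{
    unsigned full_words = 0;
    unsigned packed_words = 0;
    unsigned used = WORD_BITS;

    for (size_t i = 0; i < n; i++) {
        assert(arg_bits[i] <= WORD_BITS);
        if (arg_bits[i] == WORD_BITS) {
            full_words++;
        } else if (arg_bits[i] > 0) {
            if (used + arg_bits[i] > WORD_BITS) {
                packed_words++;
                used = 0;
            }
            used += arg_bits[i];
        }
    }
    return full_words + packed_words;
}
```

    For a constructor with the argument types int8, bool, int8, int, bool,
    int8, bool, and assuming widths of 8, 1 and 64 bits, the declared order
    needs three words while the grouped order needs two, so a message
    would be warranted.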

compiler/options.m:
    Add the option --allow-packing-ints which controls whether
    du_type_layout.m will attempt to pack {int,uint}{8,16,32} arguments
    alongside enum arguments.

    Add the option --allow-packing-dummies which controls whether
    du_type_layout.m will optimize away (in other words, represent in 0 bits)
    arguments of dummy types.

    Add the option --allow-double-word-ints which controls whether
    du_type_layout.m will store arguments of the types int64 and uint64
    unboxed in two words on 32 bit platforms, the way it currently stores
    double precision floats.

    All three of those options are off by default, which preserves binary
    compatibility with existing code. However, the first two are ready
    to be switched on (the third is not).

    All three options are intended to be present in the compiler
    only until these changes are tested. Once we deem them sufficiently
    tested, I will modify the compiler to always do the packing they control,
    at which point we can delete these options. This is why they are not
    documented.

    Add the option --inform-suboptimal-packing, whose meaning is described
    above.

doc/user_guide.texi:
    Document --inform-suboptimal-packing.

compiler/prog_data.m:
    For each argument of a function symbol in a type definition, use
    a new type called arg_pos_width to record the extra information
    mentioned above (offsets for all arguments, and number of bits
    for packed arguments).

    For each function symbol that has some existential type constraints,
    record the extra information mentioned for parse_type_defn.m below.

compiler/hlds_data.m:
    Include the position, as well as the width, in the representation
    of the arguments of function symbols.

    Previously, we used the integer 0 as a tag for dummies. Add a tag to
    represent dummy values, since this gives more information to any code
    that sees that tag.

compiler/ml_unify_gen.m:
compiler/unify_gen.m:
    Handle the packing of dummy values, and of sub-word-sized ints and uints.

    Compare the cell offset of each argument computed using existing
    algorithms here with the cell offset recorded in the argument's
    representation, and abort if they are different.

    In some cases, restructure code a bit to make this possible.
    For example, for tuples and closures, this means that instead of
    simply recording that each tuple argument or closure element
    is a full word, we must record its correct offset as well.

    Handle the new dummy_tag.

    Add prelim (not yet finished) support for double-word int64s/uint64s
    on 32 bit platforms.

    When packing the values of two or more variables (or constants) into a
    single word in a memory cell, optimize away operations that are no-ops,
    such as shifting anything by zero bits, shifting the constant zero
    by any number of bits, and ORing anything with zero. This makes the
    generated code easier to read. It is probably also faster for us
    to do it here than to write out a bigger expression, have the C compiler
    read in the bigger expression, and then later make the same optimization.

    In ml_unify_gen.m, avoid the unnecessary use of a list of the argument
    variables' types separate from the list of the argument variables
    themselves; just look up the type of each argument variable when it is
    processed.

compiler/add_special_pred.m:
    When creating special (unify and compare) predicates for tuples,
    include the offsets in the representation of their arguments.

    Delete an unused predicate.

compiler/llds.m:
    Add a new way to create an rval: a cast. We use it to implement
    the extraction of signed sub-word-sized integers from packed argument
    words in terms. Masking the right N bits out of the packed word
    leaves the other 32-N or 64-N bits as zeroes; a cast to int8_t,
    int16_t or int32_t will copy the sign bit to these bits.
    Likewise, when we pack signed int{8,16,32} values into words,
    we cast them to their unsigned versions to throw away any sign-extension
    bits in their original word-sized representations.
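
    The effect of the two casts can be sketched in C as follows. This is
    only an illustration for the int8 case: the function names are
    hypothetical, and the code is not what the LLDS backend actually emits.

```c
#include <assert.h>
#include <stdint.h>

// Extract a signed 8-bit field from a packed word. Masking leaves the upper
// bits zero; the cast to int8_t followed by widening to intptr_t copies the
// sign bit into them. (Converting an out-of-range value to int8_t is
// implementation-defined before C23, but wraps on two's complement targets.)
static intptr_t extract_int8(uintptr_t word, unsigned shift)
{
    uintptr_t raw = (word >> shift) & 0xff;
    return (intptr_t) (int8_t) raw;
}

// Pack a signed 8-bit value into a word. The cast to uint8_t throws away
// the sign-extension bits of the word-sized representation, so they cannot
// spill into neighbouring fields.
static uintptr_t pack_int8(uintptr_t word, unsigned shift, intptr_t value)
{
    assert(value >= INT8_MIN && value <= INT8_MAX);
    return word | ((uintptr_t) (uint8_t) value << shift);
}
```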

    No similar change is needed for the MLDS, since that already had
    a mechanism for casts.

compiler/mlds.m:
    Note a potential simplification in the MLDS.

compiler/builtin_lib_types.m:
    Add functions to return the Mercury representation of the int64
    and uint64 types.

compiler/foreign.m:
    Export a specialized version of an existing predicate, to allow
    ml_unify_gen.m to avoid the costs of the more general version.

compiler/hlds_out_module.m:
    Always print the representations of all arguments, since the
    inclusion of position information in those representations means that
    the representations of even all-full-word-argument terms are of potential
    interest when debugging term representations.

compiler/lco.m:
    Do not try to apply LCO to arguments of dummy types. (We could optimize
    them differently, by filling them in before they are "computed", but
    that is a separate optimization, which is of *very* low priority.)

compiler/liveness.m:
    Do not include variables of dummy types in resume points.

    The reason for this is that the code that establishes a resume point
    returns, for each such variable, a list of *lvals* where that variable
    can be found. The new code in unify_gen.m will optimize away assignments
    to values of dummy types, so there is *no* lval where they can be found.
    We could allocate one, but doing so would be a pessimization. Instead,
    we simply don't save and restore such values. When their value (which is
    always 0) is needed, we can create them out of thin air.

compiler/ml_global_data.m:
    Include the target language in the ml_global_data structure, to prevent
    some of its users having to look it up in the module_info.

    Add notes about specializing the implementation of arrays of
    int64s/uint64s on 32 bit platforms.

compiler/check_typeclass.m:
compiler/ml_type_gen.m:
    Add sanity checks of the new precomputed fields of exist_constraints.

    Conform to the changes above.

compiler/mlds_to_c.m:
    Add prelim (not yet finished) support for double-word int64s/uint64s
    on 32 bit platforms.

    Add notes about possible optimizations.

compiler/parse_type_defn.m:
    When a function symbol in a type definition contains existential
    arguments, precompute and store the set of constrained and unconstrained
    type variables. The code in du_type_layout.m needs this information
    to compute the number of slots occupied by typeinfos and typeclass_infos
    in memory cells for this function symbol, and several other places
    in the compiler do too. It is easier and faster to compute this
    information just once, and this is the earliest time at which that can be done.

compiler/type_ctor_info.m:
    Use the prerecorded information about existential types to simplify
    the code here.

compiler/polymorphism.m:
    Add an XXX about possibly using the extra info we now record in
    exist_constraints to simplify the job of polymorphism.m.

compiler/pragma_c_gen.m:
compiler/var_locn.m:
    Create the values of dummy variables from scratch, if needed.

compiler/rtti.m:
    Replace a bool with a bespoke type.

compiler/rtti_out.m:
compiler/rtti_to_mlds.m:
    When generating RTTI information for the LLDS and MLDS backends
    respectively, record new kinds of arguments as needing special
    treatment. These are int64s and uint64s stored unboxed in two words
    on 32 bit platforms, {int,uint}{8,16,32} values packed into words,
    and dummy arguments. Each of these has a special code: its own
    negative value in the num_bits field of the argument.

    Generate slightly better formatted output.

compiler/type_util.m:
    Delete a predicate that isn't needed anymore.

compiler/opt_util.m:
    Delete a function that hasn't been needed for a while.

    Conform to the changes above.

compiler/arg_pack.m:
compiler/bytecode_gen.m:
compiler/call_gen.m:
compiler/code_util.m:
compiler/ctgc.selector.m:
compiler/dupelim.m:
compiler/dupproc.m:
compiler/equiv_type.m:
compiler/equiv_type_hlds.m:
compiler/erl_code_gen.m:
compiler/erl_rtti.m:
compiler/export.m:
compiler/exprn_aux.m:
compiler/global_data.m:
compiler/jumpopt.m:
compiler/livemap.m:
compiler/llds_out_data.m:
compiler/middle_rec.m:
compiler/ml_closure_gen.m:
compiler/ml_switch_gen.m:
compiler/ml_top_gen.m:
compiler/module_qual.qualify_items.m:
compiler/opt_debug.m:
compiler/parse_tree_out.m:
compiler/peephole.m:
compiler/recompilation.usage.m:
compiler/resolve_unify_functor.m:
compiler/stack_layout.m:
compiler/structure_reuse.direct.choose_reuse.m:
compiler/switch_util.m:
compiler/typecheck.m:
compiler/unify_proc.m:
compiler/unused_imports.m:
compiler/xml_documentation.m:
    Conform to the changes above.

compiler/llds_out_util.m:
    Add a comment.

compiler/ml_code_util.m:
    Factor out some common code.

runtime/mercury_type_info.h:
    Allocate special values of the MR_arg_bits field of the MR_DuArgLocn type
    to designate arguments as two word int64/uint64s, as sub-word-sized
    arguments of types {int,uint}{8,16,32}, or as arguments of dummy types.
    (We already had a special value for two word float arguments.)

    Document the list of places that know about this code, so that they
    can be updated if and when it changes.

library/construct.m:
    Handle the construction of terms with two-word int64/uint64 arguments,
    with packed {int,uint}{8,16,32} arguments, and with dummy arguments.

    Factor out the code common to the sectag-present and sectag-absent cases,
    to make it possible to do the above in just *one* place.

library/store.m:
    Add an XXX to a place that I don't think handles two word arguments
    correctly. (I think this is an old bug.)

runtime/mercury_deconstruct.c:
    Handle the deconstruction of terms with two-word int64/uint64 arguments,
    with packed {int,uint}{8,16,32} arguments, and with dummy arguments.

runtime/mercury_deep_copy_body.h:
    Handle the copying of terms with two-word int64/uint64 arguments,
    with packed {int,uint}{8,16,32} arguments, and with dummy arguments.

    Give a macro a more descriptive name.

runtime/mercury_type_info.c:
    Handle taking the size of terms with two-word int64/uint64 arguments,
    with packed {int,uint}{8,16,32} arguments, and with dummy arguments.

runtime/mercury.h:
    Put related definitions next to each other.

runtime/mercury_deconstruct.h:
runtime/mercury_ml_expand_body.h:
    Fix indentation.

tests/hard_coded/construct_test.{m,exp}:
    Add to this test case a test of the construction, via the library's
    construct.m module, of terms containing packed sub-word-sized integers,
    and packed dummies.

tests/hard_coded/deconstruct_arg.{m,exp}:
    Convert the source code of this test case to state variable notation,
    and update the line number references (in the names of predicates created
    from lambda expressions) accordingly.

tests/hard_coded/uint64_ground_term.{m,exp}:
    A new test case to check that uint64 values too large to be int64 values
    can be stored in static structures.

tests/hard_coded/Mmakefile:
    Enable the new test case.
2018-05-05 13:22:19 +02:00
Zoltan Somogyi
c6074649f5 Clarify a comment. 2018-04-20 10:04:42 +10:00
Zoltan Somogyi
15aa457e12 Delete $module arg from calls to unexpected. 2018-04-07 18:25:43 +10:00
Zoltan Somogyi
16fb9f9edf Delete the backends' own ptag types.
compiler/llds.m:
compiler/mlds.m:
    Delete the tag type from llds.m, and the mlds_ptag type from mlds.m.
    Replace their uses with the ptag type defined in hlds_data.m.

compiler/code_loc_dep.m:
compiler/llds_out_data.m:
compiler/llds_out_instr.m:
compiler/ml_unify_gen.m:
compiler/mlds_to_c.m:
compiler/peephole.m:
compiler/unify_gen.m:
compiler/var_locn.m:
    Conform to the changes above.
2018-03-07 18:09:50 +11:00
Zoltan Somogyi
17432002c9 Simplify the handling of tags on new_objects.
compiler/mlds.m:
    Change the field of the new_object MLDS statement that deals with tags
    from being a maybe(mlds_ptag) to being just an mlds_ptag.

    This field is not actually used by any of mlds_to_{c,cs,java}.m.
    It is used only by ml_accurate_gc.m, to decide whether to tag
    a copied pointer. For that purpose, testing whether the ptag is zero
    is just as easy as testing whether the maybe ptag is yes(...),
    and ml_unify_gen.m used to set this field to yes(...) if and only if
    the ptag was not zero anyway. The change allows the compiler to avoid
    allocating a yes(...) cell for every new_object statement it generates.

    Clarify a related comment.

compiler/ml_unify_gen.m:
    Pass ptags instead of maybe ptags to the predicates that create new_object
    statements. We can do this because we always know what ptag we are about to
    put on the new object we are allocating. Setting MaybePtag to no
    when Ptag was 0 was effectively forgetting this fact.

    Not forgetting allows us to generate better code when generating code
    for a unification involved in the LCMC optimization. If the functor
    involved in the LCMC has primary tag 0, the address we compute
    for the field to be filled in used to have a MaybePtag = no
    in the first argument of the ml_field rval, which meant that
    the primary tag had to be masked off at runtime. We now generate
    an ml_field rval that has yes(0) as its first field, which tells the
    runtime that the primary tags bits *don't* have to be masked off.
    This is shown by this diff, from just before the call to qualify_inst
    generated for the call from qualify_mode in module_qual.qualify_items.m:

    -   AddrInstB_40 = (MR_Word *) &(MR_hl_mask_field((MR_Word) *Mode_12,
            (MR_Integer) 1));
    +   AddrInstB_40 = (MR_Word *) &(MR_hl_field(MR_mktag(0), *Mode_12,
            (MR_Integer) 1));
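
    The difference between the two forms can be sketched with hypothetical
    stand-ins for the runtime macros. TAG_BITS, TAG_MASK and both function
    names below are assumptions made for illustration; they are not the
    definitions in the Mercury runtime headers.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_BITS 2  // hypothetical number of primary tag bits
#define TAG_MASK (((uintptr_t) 1 << TAG_BITS) - 1)

// Unknown ptag: the tag bits must be masked off at runtime.
static uintptr_t *field_addr_masked(uintptr_t tagged, int i)
{
    return (uintptr_t *) (tagged & ~TAG_MASK) + i;
}

// Known ptag: subtracting a compile-time constant is enough, and for
// ptag 0 the subtraction disappears entirely.
static uintptr_t *field_addr_known_tag(uintptr_t tagged, uintptr_t ptag, int i)
{
    assert((tagged & TAG_MASK) == ptag);
    return (uintptr_t *) (tagged - ptag) + i;
}
```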

    Reduce superficial differences between two pieces of code doing
    very similar tasks, as a prelude to factoring them out.

compiler/ml_accurate_gc.m:
compiler/ml_elim_nested.m:
compiler/ml_optimize.m:
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
    Conform to the change in mlds.m.
2018-03-06 01:25:33 +11:00
Zoltan Somogyi
0dce068b9a Give a type a less ambiguous name.
compiler/mlds.m:
    Rename the type mlds_tag as mlds_ptag, since it stands for primary tags
    (whose standard abbreviation inside the compiler is ptag), and the
    process of generating MLDS also handles two other kinds of tags:
    secondary tags and cons_tags.

compiler/ml_accurate_gc.m:
compiler/ml_elim_nested.m:
compiler/ml_unify_gen.m:
compiler/mlds_to_c.m:
    Conform to the change in mlds.m.
2018-03-01 00:15:00 +11:00
Zoltan Somogyi
8d0597f97d Use typed rvals in the MLDS code generator.
compiler/mlds.m:
    Introduce the concept of typed rvals to the MLDS (it is already present
    in the LLDS).

    To start with, use typed rvals to represent the arguments of the new_object
    operation.

compiler/ml_accurate_gc.m:
compiler/ml_closure_gen.m:
compiler/ml_elim_nested.m:
compiler/ml_optimize.m:
compiler/ml_rename_classes.m:
compiler/ml_unify_gen.m:
compiler/ml_util.m:
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/mlds_to_target_util.m:
    Conform to the change in the new_object operation. In most places,
    this means replacing two lists (one of rvals, and one of types)
    that had to have the same length, with one list of typed rvals.
    The use of typed rvals thus encodes a required invariant in the
    type of the data, making its violation impossible. It also means
    that many places that used to require iterating on two lists
    simultaneously can be replaced by a simple iteration on one list only.
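
    The same idea, transposed to C as a sketch: one array of pairs instead
    of two parallel arrays whose lengths must agree. The typed_rval struct
    and the arg_type enum here are hypothetical stand-ins, not the MLDS
    definitions.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TYPE_INT, TYPE_FLOAT, TYPE_WORD } arg_type;  // hypothetical

// Each value carries its own type, so there is no second list whose
// length could get out of step with the first.
typedef struct {
    long     rval;  // stand-in for an MLDS rval
    arg_type type;
} typed_rval;

// One loop over one array is all that is needed.
static size_t count_args_of_type(const typed_rval *args, size_t n,
    arg_type wanted)
{
    assert(args != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (args[i].type == wanted) {
            count++;
        }
    }
    return count;
}
```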
2018-02-28 22:19:17 +11:00
Zoltan Somogyi
5fa2be7075 Fix a comment. 2018-02-28 17:48:11 +11:00
Zoltan Somogyi
dfe108251a Simplify the representation of foreign_export_enums.
compiler/hlds_module.m:
    The backends need to know how each constant in an enum type
    is represented if that type has a foreign_export_enum pragma.
    Transmit that information to them using the ctor_repns of the constants,
    so that they don't have to recreate it from the name of the constant
    and the cons_id_to_tag_map.

compiler/ml_type_gen.m:
    Conform to the change in hlds_module.m.

    Factor out a switch converting enums' tags to rvals that was duplicated
    in two places.

compiler/export.m:
    Conform to the change in hlds_module.m.

    Include the foreign name of an enum constant with its representation;
    this simplifies the code.

compiler/add_foreign_enum.m:
    Conform to the change in hlds_module.m.

compiler/mlds.m:
    Document the reason why the MLDS representation of exported enums
    needs to know the type constructor.
2018-02-21 20:15:47 +11:00
Julien Fischer
f80463dbcb Add builtin 64-bit integer types -- Part 2.
Replace placeholder types with int64 and uint64 as appropriate throughout the
system.

Enable support for 64-bit integer literals in the compiler.

Add initial library support for 64-bit integers.

configure.ac:
     Check that the bootstrap compiler recognises int64 and uint64 as
     builtins.

library/int64.m:
library/uint64.m:
     Populate these two modules to the extent that we can now run
     basic tests of 64-bit integer support.

     Note that since the bootstrap compiler will not recognise
     64-bit integer literals, any such literals are currently written
     as conversions from ints; this will be replaced once this change
     has bootstrapped.

library/private_builtin.m:
    Replace the placeholder definitions for builtin unification and
    comparison of 64-bit integers with their actual definitions.

library/integer.m:
    Add procedures for converting integers to and from int64 and uint64.

library/string.m:
    Add functions for converting 64-bit integers into strings.

library/io.m:
    Add predicates for writing 64-bit integers to text streams.
    (Support for 64-bit integers with binary streams will be done
    separately.)

library/stream.string_writer.m:
    Add put_int64/4 and put_uint64/4.

    Extend the implementations of print and write to cover int64 and
    uint64.

library/pprint.m:
    Make int64 and uint64 instances of the doc/1 type class.

library/erlang_rtti_implementation.m:
library/rtti_implementation.m:
    Handle int64 and uint64 properly in deconstruct.

library/term.m:
    Add functions for converting 64-bit integers into terms.

library/term_conversion.m:
    Support int64 and uint64 in univ -> term conversion.

library/Mercury.options:
    Avoid a warning about the import of the require module being
    unused in the int64 and uint64 modules.  It *is* used,
    but only in the definitions used by the Erlang backend.

compiler/superhomogeneous.m:
     Accept 64-bit integer literals.

compiler/c_util.m:
     In C, write out the value of the min_int64 as the symbolic
     constant INT64_MIN.  This expands in such a way as to avoid
     generating warnings from the C compiler.
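
     The underlying C issue is that -9223372036854775808 is parsed as a
     unary minus applied to 9223372036854775808, which does not fit in any
     signed 64-bit type, whereas INT64_MIN from <stdint.h> is a well-formed
     constant expression. A minimal sketch of the special case follows; the
     function name is hypothetical, and the use of INT64_C for all other
     values is an assumption, not necessarily what c_util.m writes.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

// Write the C source text for a 64-bit signed constant into buf.
static void format_int64_literal(int64_t value, char *buf, size_t size)
{
    assert(size >= 32);
    if (value == INT64_MIN) {
        // Emit the symbolic name rather than the out-of-range digits.
        snprintf(buf, size, "INT64_MIN");
    } else {
        snprintf(buf, size, "INT64_C(%" PRId64 ")", value);
    }
}
```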

compiler/builtin_ops.m:
compiler/bytecode.m:
compiler/elds.m:
compiler/elds_to_erlang.m:
compiler/hlds_data.m:
compiler/hlds_out_util.m:
compiler/llds.m:
compiler/llds_out_data.m:
compiler/lookup_switch.m:
compiler/mercury_to_mercury.m:
compiler/mlds.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/opt_debug.m:
compiler/parse_tree_out_info.m:
compiler/parse_tree_to_term.m:
compiler/prog_data.m:
compiler/prog_out.m:
compiler/prog_rep.m:
     Replace the use of int as a placeholder with int64 or uint64 as
     appropriate.

tests/hard_coded/Mmakefile:
tests/hard_coded/arith_int64.{m,exp}:
tests/hard_coded/arith_uint64.{m,exp}:
tests/hard_coded/bitwise_int64.{m,exp}:
tests/hard_coded/bitwise_uint64.{m,exp}:
tests/hard_coded/cmp_int64.{m,exp}:
tests/hard_coded/cmp_uint64.{m,exp}:
tests/hard_coded/integer_int64_conv.{m,exp}:
tests/hard_coded/integer_uint64_conv.{m,exp}:
     Add tests of basic operations on 64-bit integers.

tests/hard_coded/construct_test.{m,exp}:
    Extend this test to cover 64-bit integers.
2018-02-02 10:33:25 -05:00
Julien Fischer
a9b26c923c Support 64-bit integers in static ground terms with the MLDS backend.
compiler/ml_code_util.m:
    As above.

compiler/mlds.m:
    Extend the type describing the different kinds of global variable
    holding constant immutable data to cover 64-bit integers.
2018-01-26 07:21:09 -05:00
Julien Fischer
f519e26173 Add builtin 64-bit integer types -- Part 1.
Add the new builtin types: int64 and uint64.

Support for these new types will need to be bootstrapped over several changes.
This is the first such change and does the following:

- Extends the compiler to recognise 'int64' and 'uint64' as builtin types.
- Extends the set of builtin arithmetic, bitwise and relational operators
  to cover the new types.
- Adds the new internal option '--unboxed-int64s' to the compiler; this will be
  used to control whether 64-bit integer types are boxed or not.
- Extends all of the code generators to handle the new types.
- Extends the runtimes to support the new types.
- Adds new modules to the standard library intended to contain basic operations
  on the new types.  (These are currently empty and not documented.)

There are a bunch of limitations marked with "XXX INT64"; these will be lifted in
part 2 of this change.  Also, 64-bit integer types are currently always boxed;
again, this limitation will be lifted in later changes.

compiler/options.m:
    Add the new option --unboxed-int64s.

compiler/prog_type.m:
compiler/prog_data.m:
compiler/builtin_lib_types.m:
     Recognise int64 and uint64 as builtin types.

compiler/builtin_ops.m:
     Add builtin operations for the new types.

compiler/hlds_data.m:
     Add new tag types for the new types.

compiler/ctgc.selector.m:
compiler/dead_proc_elim.m:
compiler/export.m:
compiler/foreign.m:
compiler/goal_util.m:
compiler/higher_order.m:
compiler/hlds_code_util.m:
compiler/hlds_dependency_graph.m:
compiler/hlds_out_pred.m:
compiler/hlds_out_util.m:
compiler/implementation_defined_literals.m:
compiler/inst_check.m:
compiler/mercury_to_mercury.m:
compiler/mode_util.m:
compiler/module_qual.qualify_items.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/parse_tree_to_term.m:
compiler/parse_type_name.m:
compiler/polymorphism.m:
compiler/prog_out.m:
compiler/prog_util.m:
compiler/rbmm.execution_path.m:
compiler/rtti.m:
compiler/table_gen.m:
compiler/type_util.m:
compiler/typecheck.m:
compiler/unify_gen.m:
compiler/unify_proc.m:
compiler/unused_imports.m:
compiler/xml_documentation.m:
    Conform to the above changes to the parse tree and HLDS.

compiler/c_util.m:
    Support writing out constants of the new types.

compiler/llds.m:
    Add a representation for constants of the new types to the LLDS.

compiler/stack_layout.m:
    Add a new field to the stack layout params that records whether
    64-bit integers are boxed or not.

compiler/call_gen.m:
compiler/code_info.m:
compiler/disj_gen.m:
compiler/dupproc.m:
compiler/exprn_aux.m:
compiler/global_data.m:
compiler/jumpopt.m:
compiler/llds_out_data.m:
compiler/llds_out_instr.m:
compiler/lookup_switch.m:
compiler/mercury_compile_llds_back_end.m:
compiler/prog_rep.m:
compiler/prog_rep_tables.m:
compiler/var_locn.m:
    Support the new types in the LLDS code generator.

compiler/mlds.m:
    Support constants of the new types in the MLDS.

compiler/ml_call_gen.m:
compiler/ml_code_util.m:
compiler/ml_global_data.m:
compiler/ml_rename_classes.m:
compiler/ml_top_gen.m:
compiler/ml_type_gen.m:
compiler/ml_unify_gen.m:
compiler/ml_util.m:
compiler/mlds_to_target_util.m:
compiler/rtti_to_mlds.m:
     Conform to the above changes to the MLDS.

compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
    Generate the appropriate target code for constants of the new types
    and operations involving them.

compiler/bytecode.m:
compiler/bytecode_gen.m:
    Handle the new types in the bytecode generator; we just abort if we
    encounter them for now.

compiler/elds.m:
compiler/elds_to_erlang.m:
compiler/erl_call_gen.m:
compiler/erl_code_util.m:
compiler/erl_unify_gen.m:
    Handle the new types in the Erlang code generator.

library/private_builtin.m:
    Add placeholders for the builtin unify and compare operations for
    the new types.  Since the bootstrapping compiler will not recognise
    the new types we give them polymorphic arguments.  These can be
    replaced after this change has bootstrapped.

    Update the Java list of TypeCtorRep constants here.

library/int64.m:
library/uint64.m:
    New modules that will eventually contain builtin operations on the new
    types.

library/library.m:
library/MODULES_UNDOC:
    Do not include the above modules in the library documentation for now.

library/construct.m:
library/erlang_rtti_implementation.m:
library/rtti_implementation.m:
library/table_statistics.m:
deep_profiler/program_representation_utils.m:
mdbcomp/program_representation.m:
    Handle the new types.

configure.ac:
runtime/mercury_conf.h.in:
    Define the macro MR_BOXED_INT64S.  For now it is always defined; support for
    unboxed 64-bit integers will be enabled in a later change.

runtime/mercury_dotnet.cs.in:
java/runtime/TypeCtorRep.java:
runtime/mercury_type_info.h:
    Update the list of type_ctor reps.

runtime/mercury.h:
runtime/mercury_int.[ch]:
    Add macros for int64 / uint64 -> MR_Word conversion, boxing and
    unboxing.

    Add functions for hashing 64-bit integer types suitable for use
    with the tabling mechanism.

runtime/mercury_tabling.[ch]:
    Add additional HashTableSlot structs for 64-bit integer types.

    Omit the '%' character from the conversion specifiers we pass via
    the 'key_format' argument to the macros that generate the table lookup
    function.  This is so we can use the C99 exact size integer conversion
    specifiers (e.g. PRIu64 etc.) directly here.

runtime/mercury_hash_lookup_or_add_body.h:
    Add the '%' character that was omitted above to the call to debug_key_msg.
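
    The reason for omitting the '%' is that PRIu64 and friends expand to
    string literals without it, so the macro that builds the format string
    has to supply it by string literal concatenation. A small sketch of the
    pattern follows; FORMAT_KEY and format_uint64_key are hypothetical
    names, not the runtime's macros.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

// key_format is passed WITHOUT the leading '%'; the macro adds it, which
// lets callers pass PRIu64, PRId64, "d" and so on directly.
#define FORMAT_KEY(buf, size, key_format, key) \
    snprintf((buf), (size), "key: %" key_format, (key))

// Example caller for unsigned 64-bit keys.
static void format_uint64_key(char *buf, size_t size, uint64_t key)
{
    int n = FORMAT_KEY(buf, size, PRIu64, key);
    assert(n > 0 && (size_t) n < size);  // the output was not truncated
}
```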

runtime/mercury_memory.h:
     Add new builtin allocation sites for boxed 64-bit integer types.

runtime/mercury_builtin_types.[ch]:
runtime/mercury_builtin_types_proc_layouts.h:
runtime/mercury_construct.c:
runtime/mercury_deconstruct.c:
runtime/mercury_deep_copy_body.h:
runtime/mercury_ml_expand_body.h:
runtime/mercury_table_type_body.h:
runtime/mercury_tabling_macros.h:
runtime/mercury_tabling_preds.h:
runtime/mercury_term_size.c:
runtime/mercury_unify_compare_body.h:
    Add the new builtin types and handle them throughout the runtime.

runtime/Mmakefile:
    Add mercury_int.c to the list of .c files.

doc/reference_manual.texi:
     Add the new types to the list of reserved type names.

     Add the mapping from the new types to their target language types.
     These are commented out for now.
2018-01-12 09:29:24 -05:00
Zoltan Somogyi
f1906ece65 Fix some too-long lines. 2017-12-14 14:12:30 +11:00
Zoltan Somogyi
dc4196e5af Separate breaks from loops and breaks from switches in MLDS.
compiler/mlds.m:
    Replace goto_break with goto_break_loop and goto_break_switch, each
    intended to break from a particular construct. It was confusion between
    the two kinds of breaks that led to the earlier bug that broke
    --prefer-while-loop-over-jump-mutual; this separation should make
    such bugs easy to detect.

    Rename goto_continue as goto_continue_loop to match the new naming scheme.

compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
    When emitting goto_break_loop and goto_break_switch, check whether
    the nearest enclosing break-able scope is a loop or switch respectively.

    To make this check possible, record the nearest break-able scope.
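
    In outline, the check amounts to the following C sketch. The enum and
    function names are hypothetical; the real code is Mercury, and the
    scope type it uses lives in mlds_to_target_util.m.

```c
#include <assert.h>
#include <stdbool.h>

// The nearest enclosing construct that a break statement would leave.
typedef enum { SCOPE_NONE, SCOPE_LOOP, SCOPE_SWITCH } breakable_scope;

// The two kinds of break that replace the old goto_break.
typedef enum { GOTO_BREAK_LOOP, GOTO_BREAK_SWITCH } break_kind;

// A break may be emitted only if the nearest break-able scope is the
// kind of construct the break was intended for.
static bool break_matches_scope(break_kind kind, breakable_scope nearest)
{
    switch (kind) {
        case GOTO_BREAK_LOOP:
            return nearest == SCOPE_LOOP;
        case GOTO_BREAK_SWITCH:
            return nearest == SCOPE_SWITCH;
    }
    assert(false);  // not reached for valid break_kind values
    return false;
}
```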

    While these additions make the compiler do extra work, the performance
    impact is negligible.

compiler/mlds_to_target_util.m:
    Add the type that mlds_to_{c,cs,java}.m all use to identify
    break-able scopes.

compiler/ml_call_gen.m:
compiler/ml_proc_gen.m:
compiler/ml_string_switch.m:
    Update the code that generates gotos.
2017-11-11 12:35:54 +11:00
Zoltan Somogyi
234501be75 Remove ml_tailcall.m and associated code.
Now that we can optimize tail recursion for all MLDS targets better via
the MLDS code generator than via ml_tailcall.m, we don't need it anymore.

compiler/ml_tailcall.m:
    Delete this module.

compiler/ml_backend.m:
compiler/notes/compiler_design.html:
    Delete the inclusion and the documentation of the deleted module.

compiler/mark_tail_calls.m:
    Update old references to the deleted module, as well as some comments.

compiler/mercury_compile_mlds_back_end.m:
    Don't invoke the deleted module.

compiler/options.m:
    Delete the (developer-only) options that used to control whether
    we did tail call optimization (TCO) via ml_tailcall.m or not.

compiler/ml_optimize.m:
    Delete the parts of this module that worked in concert with ml_tailcall.m
    to implement TCO.

compiler/mlds.m:
    Delete the field from ml_call_stmts that was needed only by ml_tailcall.m.

compiler/ml_call_gen.m:
    Don't fill in the deleted field.

    Shift here the only part of the old contents of ml_tailcall.m that is
    still needed, the check for whether rvals would become dangling references
    if we discarded the current call's stack frame.

compiler/ml_elim_nested.m:
    Conform to the change to mlds.m, and eliminate an unused field
    in elim_info.

compiler/ml_accurate_gc.m:
compiler/ml_code_util.m:
compiler/ml_commit_gen.m:
compiler/ml_proc_gen.m:
compiler/ml_rename_classes.m:
compiler/ml_util.m:
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/mlds_to_target_util.m:
    Conform to the changes above.
2017-11-10 14:26:32 +11:00
Zoltan Somogyi
7b0ca6345f Encode invariants about class inheritance in types.
compiler/mlds.m:
    Make mlds_interface_id its own type, instead of identifying all
    interfaces by an MLDS type using the mlds_class_type/3 data constructors.

    Make mlds_class_id its own type, instead of identifying (almost all)
    classes by an MLDS type using the mlds_class_type/3 data constructors.

    Change the field of mlds_class_defns that says what base classes the
    class inherits from to reflect the facts that

    - the "classes" representing environments that we put on the heap
      when targeting C# or Java have a base *type*, not a base *class*, and

    - no current MLDS target language supports multiple inheritance,
      so an MLDS class cannot inherit from more than one base class.

    Change the mlds_class_type data constructor of the mlds_type type
    to take a complete mlds_class_id as an argument, instead of its pieces.

compiler/ml_accurate_gc.m:
compiler/ml_code_util.m:
compiler/ml_elim_nested.m:
compiler/ml_global_data.m:
compiler/ml_lookup_switch.m:
compiler/ml_rename_classes.m:
compiler/ml_simplify_switch.m:
compiler/ml_type_gen.m:
compiler/ml_unify_gen.m:
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
    Conform to the changes above.
2017-10-19 19:13:28 +11:00
Zoltan Somogyi
e3a4968746 Delete a misleadingly named type.
compiler/mlds.m:
    Delete the type "mlds_type_name". Despite its name, it is not the name
    of a type in either the HLDS (mer_type) or MLDS (mlds_type) sense;
    it is the name of a class and its arity.

compiler/ml_elim_nested.m:
compiler/ml_global_data.m:
compiler/ml_rename_classes.m:
compiler/ml_type_gen.m:
compiler/mlds_to_c.m:
compiler/mlds_to_cs.m:
compiler/mlds_to_java.m:
compiler/mlds_to_target_util.m:
    Replace uses of mlds_type_name with the class name and its arity.
    In most cases, the new code is clearer as well as faster (since it avoids
    creating or traversing a memory cell).

    In ml_global_data.m, put some code into execution order.
2017-10-15 04:34:31 +11:00
Zoltan Somogyi
11a65f226a Delete a redundant type synonym. 2017-10-15 02:59:21 +11:00