mirror of
https://github.com/Mercury-Language/mercury.git
synced 2025-12-08 18:34:00 +00:00
Estimated hours taken: 400
Branches: main
Implement the infrastructure for term size profiling. This means adding two
new grade components, tsw and tsc, and implementing them in the LLDS code
generator. In grades including tsw (term size words), each term is augmented
with an extra word giving the number of heap words it contains; in grades
including tsc (term size cells), each term is augmented with an extra word
giving the number of heap cells it contains. The extra word is at the start,
at offset -1, to leave almost all of the machinery for accessing the heap
unchanged.
For now, the only way to access term sizes is with a new mdb command,
"term_size <varspec>". Later, we will use term sizes in conjunction with
deep profiling to do experimental complexity analysis, but that requires
a lot more research. This diff is a necessary first step.
The implementation of term size profiling consists of three main parts:
- a source-to-source transform that computes the size of each heap cell
when it is constructed (and increments it in the rare cases when a free
argument of an existing heap cell is bound),
- a relatively small change to the code generator that reserves the extra
slot in new heap cells, and
- extensions to the facilities for creating cells from C code to record
the extra information we now need.
The diff overhauls polymorphism.m to make the source-to-source transform
possible. This overhaul includes separating type_ctor_infos and type_infos
as strictly as possible from each other, converting type_ctor_infos into
type_infos only as necessary. It also includes separating type_ctor_infos,
type_infos, base_typeclass_infos and typeclass_infos (as well as voids,
for clarity) from plain user-defined type constructors in type categorizations.
This change needs this separation because values of those four types do not
have size slots, but they ought to be treated specially in other situations
as well (e.g. by tabling).
The diff adds a new mdb command, term_size. It also replaces the proc_body
mdb command with new ways of using the existing print and browse commands
("print proc_body" and "browse proc_body") in order to make looking at
procedure bodies more controllable. This was useful in debugging the effect
of term size profiling on some test case outputs. It is not strictly tied
to term size profiling, but turns out to be difficult to disentangle.
compiler/size_prof.m:
A new module implementing the source-to-source transform.
compiler/notes/compiler_design.html:
Mention the new module.
compiler/transform_hlds.m:
Include size_prof as a submodule of transform_hlds.
compiler/mercury_compile.m:
If term size profiling is enabled, invoke its source-to-source
transform.
compiler/hlds_goal.m:
Extend construction unifications with an optional slot for recording
the size of the term if the size is a constant, or the identity of the
variable holding the size, if the size is not constant. This is
needed by the source-to-source transform.
compiler/quantification.m:
Treat the variable reference that may be in this slot as a nonlocal
variable of construction unifications, since the code generator needs
this.
compiler/compile_target_code.m:
Handle the new grade components.
compiler/options.m:
Implement the options that control term size profiling.
doc/user_guide.texi:
Document the options and grade components that control term size
profiling, and the term_size mdb command. The documentation is
commented out for now.
Modify the wording of the 'u' HLDS dump flag to include other details
of unifications (e.g. term size info) rather than just unification
categories.
Document the new alternatives of the print and browse commands. Since
they are for developers only, the documentation is commented out.
compiler/handle_options.m:
Handle the implications of term size profiling grades.
Add a -D flag value to print HLDS components relevant to HLDS
transformations.
compiler/modules.m:
Import the new builtin library module that implements the operations
needed by term size profiling automatically in term size profiling
grades.
Switch the predicate involved to use state var syntax.
compiler/prog_util.m:
Add predicates and functions that return the sym_names of the modules
needed by term size profiling.
compiler/code_info.m:
compiler/unify_gen.m:
compiler/var_locn.m:
Reserve an extra slot in heap cells and fill them in in unifications
marked by size_prof.
compiler/builtin_ops.m:
Add term_size_prof_builtin.term_size_plus as a builtin, with the same
implementation as int.+.
compiler/make_hlds.m:
Disable warnings about clauses for builtins while the change to
builtin_ops is bootstrapped.
compiler/polymorphism.m:
Export predicates that generate goals to create type_infos and
type_ctor_infos to add_to_construct.m. Rewrite their documentation
to make it more detailed.
Make orders of arguments amenable to the use of state variable syntax.
Consolidate knowledge of which type categories have builtin unify and
compare predicates in one place.
Add code to leave the types of type_ctor_infos alone: instead of
changing their types to type_info when used as arguments of other
type_infos, create a new variable of type type_info instead, and
use an unsafe_cast. This would make the HLDS closer to being type
correct, but this new code is currently commented out, for two
reasons. First, common.m is currently not smart enough to figure out
that if X and Y are equal, then similar unsafe_casts of X and Y
are also equal, and this causes the compiler do not detect some
duplicate calls it used to detect. Second, the code generators
are also not smart enough to know that if Z is an unsafe_cast of X,
then X and Z do not need separate stack slots, but can use the same
slot.
compiler/type_util.m:
Add utility predicates for returning the types of type_infos and
type_ctor_infos, for use by new code in polymorphism.m.
Move some utility predicates here from other modules, since they
are now used by more than one module.
Rename the type `builtin_type' as `type_category', to better reflect
what it does. Extend it to put the type_info, type_ctor_info,
typeclass_info, base_typeclass_info and void types into categories
of their own: treating these types as if they were a user-defined
type (which is how they used to be classified) is not always correct.
Rename the functor polymorphic_type to variable_type, since types
such as list(T) are polymorphic, but they fall into the user-defined
category. Rename user_type as user_ctor_type, since list(int) is not
wholly user-defined but falls into this category. Rename pred_type
as higher_order_type, since it also encompasses functions.
Replace code that used to check for a few of the alternatives
of this type with code that does a full switch on the type,
to ensure that they are updated if the type definition ever
changes again.
compiler/pseudo_type_info.m:
Delete a predicate whose updated implementation is now in type_util.m.
compiler/mlds_to_c.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/mlds_to_java.m:
Still treat type_infos, type_ctor_infos, typeclass_infos and
base_typeclass_infos as user-defined types, but prepare for when
they won't be.
compiler/hlds_pred.m:
Require interface typeinfo liveness when term size profiling is
enabled.
Add term_size_profiling_builtin.increase_size as a
no_type_info_builtin.
compiler/hlds_out.m:
Print the size annotations on unifications if HLDS dump flags call
for unification details. (The flag test is in the caller of the
modified predicate.)
compiler/llds.m:
Extend incr_hp instructions and data_addr_consts with optional fields
that allow the code generator to refer to N words past the start of
a static or dynamic cell. Term size profiling uses this with N=1.
compiler/llds_out.m:
When allocating memory on the heap, use the macro variants that
specify an optional offset, and specify the offset when required.
compiler/bytecode_gen.m:
compiler/dense_switch.m:
compiler/dupelim.m:
compiler/exprn_aux.m:
compiler/goal_form.m:
compiler/goal_util.m:
compiler/higher_order.m:
compiler/inst_match.m:
compiler/intermod.m:
compiler/jumpopt.m:
compiler/lambda.m:
compiler/livemap.m:
compiler/ll_pseudo_type_info.m:
compiler/lookup_switch.m:
compiler/magic_util.m:
compiler/middle_rec.m:
compiler/ml_code_util.m:
compiler/ml_switch_gen.m:
compiler/ml_unify_gen.m:
compiler/mlds.m:
compiler/mlds_to_c.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/mlds_to_java.m:
compiler/modecheck_unify.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/par_conj_gen.m:
compiler/post_typecheck.m:
compiler/reassign.m:
compiler/rl.m:
compiler/rl_key.m:
compiler/special_pred.m:
compiler/stack_layout.m:
compiler/static_term.m:
compiler/string_switch.m:
compiler/switch_gen.m:
compiler/switch_util.m:
compiler/table_gen.m:
compiler/term_util.m:
compiler/type_ctor_info.m:
compiler/unused_args.m:
compiler/use_local_vars.m:
Minor updates to conform to the changes above.
library/term_size_prof_builtin.m:
New module containing helper predicates for term size profiling.
size_prof.m generates call to these predicates.
library/library.m:
Include the new module in the library.
doc/Mmakefile:
Do not include the term_size_prof_builtin module in the library
documentation.
library/array.m:
library/benchmarking.m:
library/construct.m:
library/deconstruct.m:
library/io.m:
library/sparse_bitset.m:
library/store.m:
library/string.m:
Replace all uses of MR_incr_hp with MR_offset_incr_hp, to ensure
that we haven't overlooked any places where offsets may need to be
specified.
Fix formatting of foreign_procs.
Use new macros defined by the runtime system when constructing
terms (which all happen to be lists) in C code. These new macros
specify the types of the cell arguments, allowing the implementation
to figure out the size of the new cell based on the sizes of its
fields.
library/private_builtin.m:
Define some constant type_info structures for use by these macros.
They cannot be defined in the runtime, since they refer to types
defined in the library (list.list and std_util.univ).
util/mkinit.c:
Make the addresses of these type_info structures available to the
runtime.
runtime/mercury_init.h:
Declare these type_info structures, for use in mkinit-generated
*_init.c files.
runtime/mercury_wrapper.[ch]:
Declare and define the variables that hold these addresses, for use
in the new macros for constructing typed lists.
Since term size profiling can refer to a memory cell by a pointer
that is offset by one word, register the extra offsets with the Boehm
collector if is being used.
Document the incompatibility of MR_HIGHTAGS and the Boehm collector.
runtime/mercury_tags.h:
Define new macros for constructing typed lists.
Provide macros for preserving the old interface presented by this file
to the extent possible. Uses of the old MR_list_cons macro will
continue to work in grades without term size profiling. In term
size profiling grades, their use will get a C compiler error.
Fix a bug caused by a missing backslash.
runtime/mercury_heap.h:
Change the basic macros for allocating new heap cells to take
an optional offset argument. If this is nonzero, the macros
increment the returned address by the given number of words.
Term size profiling specifies offset=1, reserving the extra
word at the start (which is ignored by all components of the
system except term size profiling) for holding the size of the term.
Provide macros for preserving the old interface presented by this file
to the extent possible. Since the old MR_create[123] and MR_list_cons
macros did not specify type information, they had to be changed
to take additional arguments. This affects only hand-written C code.
Call new diagnostic macros that can help debug heap allocations.
Document why the macros in this files must expand to expressions
instead of statements, evn though the latter would be preferable
(e.g. by allowing them to declare and use local variables without
depending on gcc extensions).
runtime/mercury_debug.[ch]:
Add diagnostic macros to debug heap allocations, and the functions
behind them if MR_DEBUG_HEAP_ALLOC is defined.
Update the debugging routines for hand-allocated cells to print the
values of the term size slot as well as the other slots in the relevant
grades.
runtime/mercury_string.h:
Provide some needed variants of the macro for copying strings.
runtime/mercury_deconstruct_macros.h:
runtime/mercury_type_info.c:
Supply type information when constructing terms.
runtime/mercury_deep_copy_body.h:
Preserve the term size slot when copying terms.
runtime/mercury_deep_copy_body.h:
runtime/mercury_ho_call.c:
runtime/mercury_ml_expand_body.h:
Use MR_offset_incr_hp instead of MR_incr_hp to ensure that all places
that allocate cells also allocate space for the term size slot if
necessary.
Reduce code duplication by using a now standard macro for copying
strings.
runtime/mercury_grade.h:
Handle the two new grade components.
runtime/mercury_conf_param.h:
Document the C macros used to control the two new grade components,
as well as MR_DEBUG_HEAP_ALLOC.
Detect incompatibilities between high level code and profiling.
runtime/mercury_term_size.[ch]:
A new module to house a function to find and return term sizes
stored in heap cells.
runtime/mercury_proc_id.h:
runtime/mercury_univ.h:
New header files. mercury_proc_id.h contains the (unchanged)
definition of MR_Proc_Id, while mercury_univ.h contains the
definitions of the macros for manipulating univs that used to be
in mercury_type_info.h, updated to use the new macros for allocating
memory.
In the absence of these header files, the following circularity
would ensue:
mercury_deep_profiling.h includes mercury_stack_layout.h
- needs definition of MR_Proc_Id
mercury_stack_layout.h needs mercury_type_info.h
- needs definition of MR_PseudoTypeInfo
mercury_type_info.h needs mercury_heap.h
- needs heap allocation macros for MR_new_univ_on_hp
mercury_heap.h includes mercury_deep_profiling.h
- needs MR_current_call_site_dynamic for recording allocations
Breaking the circular dependency in two places, not just one, is to
minimize similar problems in the future.
runtime/mercury_stack_layout.h:
Delete the definition of MR_Proc_Id, which is now in mercury_proc_id.h.
runtime/mercury_type_info.h:
Delete the macros for manipulating univs, which are now in
mercury_univ.h.
runtime/Mmakefile:
Mention the new files.
runtime/mercury_imp.h:
runtime/mercury.h:
runtime/mercury_construct.c:
runtime/mercury_deep_profiling.h:
Include the new files at appropriate points.
runtime/mercury.c:
Change the names of the functions that create heap cells for
hand-written code, since the interface to hand-written code has
changed to include type information.
runtime/mercury_tabling.h:
Delete some unused macros.
runtime/mercury_trace_base.c:
runtime/mercury_type_info.c:
Use the new macros supplying type information when constructing lists.
scripts/canonical_grade_options.sh-subr:
Fix an undefined sh variable bug that could cause error messages
to come out without identifying the program they were from.
scripts/init_grade_options.sh-subr:
scripts/parse_grade_options.sh-subr:
scripts/canonical_grade_options.sh-subr:
scripts/mgnuc.in:
Handle the new grade components and the options controlling them.
trace/mercury_trace_internal.c:
Implement the mdb command "term_size <varspec>", which is like
"print <varspec>", but prints the size of a term instead of its value.
In non-term-size-profiling grades, it prints an error message.
Replace the "proc_body" command with optional arguments to the "print"
and "browse" commands.
doc/user_guide.tex:
Add documentation of the term_size mdb command. Since the command is
for implementors only, and works only in grades that are not yet ready
for public consumption, the documentation is commented out.
Add documentation of the new arguments of the print and browse mdb
commands. Since they are for implementors only, the documentation
is commented out.
trace/mercury_trace_vars.[ch]:
Add the functions needed to implement the term_size command, and
factor out the code common to the "size" and "print"/"browse" commands.
Decide whether to print the name of a variable before invoking the
supplied print or browse predicate on it based on a flag design for
this purpose, instead of overloading the meaning of the output FILE *
variable. This arrangement is much clearer.
trace/mercury_trace_browse.c:
trace/mercury_trace_external.c:
trace/mercury_trace_help.c:
Supply type information when constructing terms.
browser/program_representation.m:
Since the new library module term_size_prof_builtin never generates
any events, mark it as such, so that the declarative debugger doesn't
expect it to generate any.
Do the same for the deep profiling builtin module.
tests/debugger/term_size_words.{m,inp,exp}:
tests/debugger/term_size_cells.{m,inp,exp}:
Two new test cases, each testing one of the new grades.
tests/debugger/Mmakefile:
Enable the two new test cases in their grades.
Disable the tests sensitive to stack frame sizes in term size profiling
grades.
tests/debugger/completion.exp:
Add the new "term_size" mdb command to the list of command completions,
and delete "proc_body".
tests/debugger/declarative/dependency.{inp,exp}:
Use "print proc_body" instead of "proc_body".
tests/hard_coded/nondet_c.m:
tests/hard_coded/pragma_inline.m:
Use MR_offset_incr_hp instead of MR_incr_hp to ensure that all places
that allocate cells also allocate space for the term size slot if
necessary.
tests/valid/Mmakefile:
Disable the IL tests in term size profiling grades, since the term size
profiling primitives haven't been (and probably won't be) implemented
for the MLDS backends, and handle_options causes a compiler abort
for grades that combine term size profiling and any one of IL, Java
and high level C.