Files
mercury/util
Zoltan Somogyi f007b45df8 Implement the infrastructure for term size profiling.
Estimated hours taken: 400
Branches: main

Implement the infrastructure for term size profiling. This means adding two
new grade components, tsw and tsc, and implementing them in the LLDS code
generator. In grades including tsw (term size words), each term is augmented
with an extra word giving the number of heap words it contains; in grades
including tsc (term size cells), each term is augmented with an extra word
giving the number of heap cells it contains. The extra word is at the start,
at offset -1, to leave almost all of the machinery for accessing the heap
unchanged.

For now, the only way to access term sizes is with a new mdb command,
"term_size <varspec>". Later, we will use term sizes in conjunction with
deep profiling to do experimental complexity analysis, but that requires
a lot more research. This diff is a necessary first step.

The implementation of term size profiling consists of three main parts:

- a source-to-source transform that computes the size of each heap cell
  when it is constructed (and increments it in the rare cases when a free
  argument of an existing heap cell is bound),

- a relatively small change to the code generator that reserves the extra
  slot in new heap cells, and

- extensions to the facilities for creating cells from C code to record
  the extra information we now need.

The diff overhauls polymorphism.m to make the source-to-source transform
possible. This overhaul includes separating type_ctor_infos and type_infos
as strictly as possible from each other, converting type_ctor_infos into
type_infos only as necessary. It also includes separating type_ctor_infos,
type_infos, base_typeclass_infos and typeclass_infos (as well as voids,
for clarity) from plain user-defined type constructors in type categorizations.
This change needs this separation because values of those four types do not
have size slots, but they ought to be treated specially in other situations
as well (e.g. by tabling).

The diff adds a new mdb command, term_size. It also replaces the proc_body
mdb command with new ways of using the existing print and browse commands
("print proc_body" and "browse proc_body") in order to make looking at
procedure bodies more controllable. This was useful in debugging the effect
of term size profiling on some test case outputs. It is not strictly tied
to term size profiling, but turns out to be difficult to disentangle.

compiler/size_prof.m:
	A new module implementing the source-to-source transform.

compiler/notes/compiler_design.html:
	Mention the new module.

compiler/transform_hlds.m:
	Include size_prof as a submodule of transform_hlds.

compiler/mercury_compile.m:
	If term size profiling is enabled, invoke its source-to-source
	transform.

compiler/hlds_goal.m:
	Extend construction unifications with an optional slot for recording
	the size of the term if the size is a constant, or the identity of the
	variable holding the size, if the size is not constant. This is
	needed by the source-to-source transform.

compiler/quantification.m:
	Treat the variable reference that may be in this slot as a nonlocal
	variable of construction unifications, since the code generator needs
	this.

compiler/compile_target_code.m:
	Handle the new grade components.

compiler/options.m:
	Implement the options that control term size profiling.

doc/user_guide.texi:
	Document the options and grade components that control term size
	profiling, and the term_size mdb command. The documentation is
	commented out for now.

	Modify the wording of the 'u' HLDS dump flag to include other details
	of unifications (e.g. term size info) rather than just unification
	categories.

	Document the new alternatives of the print and browse commands. Since
	they are for developers only, the documentation is commented out.

compiler/handle_options.m:
	Handle the implications of term size profiling grades.

	Add a -D flag value to print HLDS components relevant to HLDS
	transformations.

compiler/modules.m:
	Import the new builtin library module that implements the operations
	needed by term size profiling automatically in term size profiling
	grades.

	Switch the predicate involved to use state var syntax.

compiler/prog_util.m:
	Add predicates and functions that return the sym_names of the modules
	needed by term size profiling.

compiler/code_info.m:
compiler/unify_gen.m:
compiler/var_locn.m:
 	Reserve an extra slot in heap cells and fill them in in unifications
	marked by size_prof.

compiler/builtin_ops.m:
	Add term_size_prof_builtin.term_size_plus as a builtin, with the same
	implementation as int.+.

compiler/make_hlds.m:
	Disable warnings about clauses for builtins while the change to
	builtin_ops is bootstrapped.

compiler/polymorphism.m:
	Export predicates that generate goals to create type_infos and
	type_ctor_infos to add_to_construct.m. Rewrite their documentation
	to make it more detailed.

	Make orders of arguments amenable to the use of state variable syntax.

	Consolidate knowledge of which type categories have builtin unify and
	compare predicates in one place.

	Add code to leave the types of type_ctor_infos alone: instead of
	changing their types to type_info when used as arguments of other
	type_infos, create a new variable of type type_info instead, and
	use an unsafe_cast. This would make the HLDS closer to being type
	correct, but this new code is currently commented out, for two
	reasons. First, common.m is currently not smart enough to figure out
	that if X and Y are equal, then similar unsafe_casts of X and Y
	are also equal, and this causes the compiler do not detect some
	duplicate calls it used to detect. Second, the code generators
	are also not smart enough to know that if Z is an unsafe_cast of X,
	then X and Z do not need separate stack slots, but can use the same
	slot.

compiler/type_util.m:
	Add utility predicates for returning the types of type_infos and
	type_ctor_infos, for use by new code in polymorphism.m.

	Move some utility predicates here from other modules, since they
	are now used by more than one module.

	Rename the type `builtin_type' as `type_category', to better reflect
	what it does. Extend it to put the type_info, type_ctor_info,
	typeclass_info, base_typeclass_info and void types into categories
	of their own: treating these types as if they were a user-defined
	type (which is how they used to be classified) is not always correct.
	Rename the functor polymorphic_type to variable_type, since types
	such as list(T) are polymorphic, but they fall into the user-defined
	category. Rename user_type as user_ctor_type, since list(int) is not
	wholly user-defined but falls into this category. Rename pred_type
	as higher_order_type, since it also encompasses functions.

	Replace code that used to check for a few of the alternatives
	of this type with code that does a full switch on the type,
	to ensure that they are updated if the type definition ever
	changes again.

compiler/pseudo_type_info.m:
	Delete a predicate whose updated implementation is now in type_util.m.

compiler/mlds_to_c.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/mlds_to_java.m:
	Still treat type_infos, type_ctor_infos, typeclass_infos and
	base_typeclass_infos as user-defined types, but prepare for when
	they won't be.

compiler/hlds_pred.m:
	Require interface typeinfo liveness when term size profiling is
	enabled.

	Add term_size_profiling_builtin.increase_size as a
	no_type_info_builtin.

compiler/hlds_out.m:
	Print the size annotations on unifications if HLDS dump flags call
	for unification details. (The flag test is in the caller of the
	modified predicate.)

compiler/llds.m:
	Extend incr_hp instructions and data_addr_consts with optional fields
	that allow the code generator to refer to N words past the start of
	a static or dynamic cell. Term size profiling uses this with N=1.

compiler/llds_out.m:
	When allocating memory on the heap, use the macro variants that
	specify an optional offset, and specify the offset when required.

compiler/bytecode_gen.m:
compiler/dense_switch.m:
compiler/dupelim.m:
compiler/exprn_aux.m:
compiler/goal_form.m:
compiler/goal_util.m:
compiler/higher_order.m:
compiler/inst_match.m:
compiler/intermod.m:
compiler/jumpopt.m:
compiler/lambda.m:
compiler/livemap.m:
compiler/ll_pseudo_type_info.m:
compiler/lookup_switch.m:
compiler/magic_util.m:
compiler/middle_rec.m:
compiler/ml_code_util.m:
compiler/ml_switch_gen.m:
compiler/ml_unify_gen.m:
compiler/mlds.m:
compiler/mlds_to_c.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/mlds_to_java.m:
compiler/modecheck_unify.m:
compiler/opt_debug.m:
compiler/opt_util.m:
compiler/par_conj_gen.m:
compiler/post_typecheck.m:
compiler/reassign.m:
compiler/rl.m:
compiler/rl_key.m:
compiler/special_pred.m:
compiler/stack_layout.m:
compiler/static_term.m:
compiler/string_switch.m:
compiler/switch_gen.m:
compiler/switch_util.m:
compiler/table_gen.m:
compiler/term_util.m:
compiler/type_ctor_info.m:
compiler/unused_args.m:
compiler/use_local_vars.m:
	Minor updates to conform to the changes above.

library/term_size_prof_builtin.m:
	New module containing helper predicates for term size profiling.
	size_prof.m generates call to these predicates.

library/library.m:
	Include the new module in the library.

doc/Mmakefile:
	Do not include the term_size_prof_builtin module in the library
	documentation.

library/array.m:
library/benchmarking.m:
library/construct.m:
library/deconstruct.m:
library/io.m:
library/sparse_bitset.m:
library/store.m:
library/string.m:
	Replace all uses of MR_incr_hp with MR_offset_incr_hp, to ensure
	that we haven't overlooked any places where offsets may need to be
	specified.

	Fix formatting of foreign_procs.

	Use new macros defined by the runtime system when constructing
	terms (which all happen to be lists) in C code. These new macros
	specify the types of the cell arguments, allowing the implementation
	to figure out the size of the new cell based on the sizes of its
	fields.

library/private_builtin.m:
	Define some constant type_info structures for use by these macros.
	They cannot be defined in the runtime, since they refer to types
	defined in the library (list.list and std_util.univ).

util/mkinit.c:
	Make the addresses of these type_info structures available to the
	runtime.

runtime/mercury_init.h:
	Declare these type_info structures, for use in mkinit-generated
	*_init.c files.

runtime/mercury_wrapper.[ch]:
	Declare and define the variables that hold these addresses, for use
	in the new macros for constructing typed lists.

	Since term size profiling can refer to a memory cell by a pointer
	that is offset by one word, register the extra offsets with the Boehm
	collector if is being used.

	Document the incompatibility of MR_HIGHTAGS and the Boehm collector.

runtime/mercury_tags.h:
	Define new macros for constructing typed lists.

	Provide macros for preserving the old interface presented by this file
	to the extent possible. Uses of the old MR_list_cons macro will
	continue to work in grades without term size profiling. In term
	size profiling grades, their use will get a C compiler error.

	Fix a bug caused by a missing backslash.

runtime/mercury_heap.h:
	Change the basic macros for allocating new heap cells to take
	an optional offset argument. If this is nonzero, the macros
	increment the returned address by the given number of words.
	Term size profiling specifies offset=1, reserving the extra
	word at the start (which is ignored by all components of the
	system except term size profiling) for holding the size of the term.

	Provide macros for preserving the old interface presented by this file
	to the extent possible. Since the old MR_create[123] and MR_list_cons
	macros did not specify type information, they had to be changed
	to take additional arguments. This affects only hand-written C code.

	Call new diagnostic macros that can help debug heap allocations.

	Document why the macros in this files must expand to expressions
	instead of statements, evn though the latter would be preferable
	(e.g. by allowing them to declare and use local variables without
	depending on gcc extensions).

runtime/mercury_debug.[ch]:
	Add diagnostic macros to debug heap allocations, and the functions
	behind them if MR_DEBUG_HEAP_ALLOC is defined.

	Update the debugging routines for hand-allocated cells to print the
	values of the term size slot as well as the other slots in the relevant
	grades.

runtime/mercury_string.h:
	Provide some needed variants of the macro for copying strings.

runtime/mercury_deconstruct_macros.h:
runtime/mercury_type_info.c:
	Supply type information when constructing terms.

runtime/mercury_deep_copy_body.h:
	Preserve the term size slot when copying terms.

runtime/mercury_deep_copy_body.h:
runtime/mercury_ho_call.c:
runtime/mercury_ml_expand_body.h:
	Use MR_offset_incr_hp instead of MR_incr_hp to ensure that all places
	that allocate cells also allocate space for the term size slot if
	necessary.

	Reduce code duplication by using a now standard macro for copying
	strings.

runtime/mercury_grade.h:
	Handle the two new grade components.

runtime/mercury_conf_param.h:
	Document the C macros used to control the two new grade components,
	as well as MR_DEBUG_HEAP_ALLOC.

	Detect incompatibilities between high level code and profiling.

runtime/mercury_term_size.[ch]:
	A new module to house a function to find and return term sizes
	stored in heap cells.

runtime/mercury_proc_id.h:
runtime/mercury_univ.h:
	New header files. mercury_proc_id.h contains the (unchanged)
	definition of MR_Proc_Id, while mercury_univ.h contains the
	definitions of the macros for manipulating univs that used to be
	in mercury_type_info.h, updated to use the new macros for allocating
	memory.

	In the absence of these header files, the following circularity
	would ensue:

	mercury_deep_profiling.h includes mercury_stack_layout.h
		- needs definition of MR_Proc_Id
	mercury_stack_layout.h needs mercury_type_info.h
		- needs definition of MR_PseudoTypeInfo
	mercury_type_info.h needs mercury_heap.h
		- needs heap allocation macros for MR_new_univ_on_hp
	mercury_heap.h includes mercury_deep_profiling.h
		- needs MR_current_call_site_dynamic for recording allocations

	Breaking the circular dependency in two places, not just one, is to
	minimize similar problems in the future.

runtime/mercury_stack_layout.h:
	Delete the definition of MR_Proc_Id, which is now in mercury_proc_id.h.

runtime/mercury_type_info.h:
	Delete the macros for manipulating univs, which are now in
	mercury_univ.h.

runtime/Mmakefile:
	Mention the new files.

runtime/mercury_imp.h:
runtime/mercury.h:
runtime/mercury_construct.c:
runtime/mercury_deep_profiling.h:
	Include the new files at appropriate points.

runtime/mercury.c:
	Change the names of the functions that create heap cells for
	hand-written code, since the interface to hand-written code has
	changed to include type information.

runtime/mercury_tabling.h:
	Delete some unused macros.

runtime/mercury_trace_base.c:
runtime/mercury_type_info.c:
	Use the new macros supplying type information when constructing lists.

scripts/canonical_grade_options.sh-subr:
	Fix an undefined sh variable bug that could cause error messages
	to come out without identifying the program they were from.

scripts/init_grade_options.sh-subr:
scripts/parse_grade_options.sh-subr:
scripts/canonical_grade_options.sh-subr:
scripts/mgnuc.in:
	Handle the new grade components and the options controlling them.

trace/mercury_trace_internal.c:
	Implement the mdb command "term_size <varspec>", which is like
	"print <varspec>", but prints the size of a term instead of its value.
	In non-term-size-profiling grades, it prints an error message.

	Replace the "proc_body" command with optional arguments to the "print"
	and "browse" commands.

doc/user_guide.tex:
	Add documentation of the term_size mdb command. Since the command is
	for implementors only, and works only in grades that are not yet ready
	for public consumption, the documentation is commented out.

	Add documentation of the new arguments of the print and browse mdb
	commands. Since they are for implementors only, the documentation
	is commented out.

trace/mercury_trace_vars.[ch]:
	Add the functions needed to implement the term_size command, and
	factor out the code common to the "size" and "print"/"browse" commands.

	Decide whether to print the name of a variable before invoking the
	supplied print or browse predicate on it based on a flag design for
	this purpose, instead of overloading the meaning of the output FILE *
	variable. This arrangement is much clearer.

trace/mercury_trace_browse.c:
trace/mercury_trace_external.c:
trace/mercury_trace_help.c:
	Supply type information when constructing terms.

browser/program_representation.m:
	Since the new library module term_size_prof_builtin never generates
	any events, mark it as such, so that the declarative debugger doesn't
	expect it to generate any.

	Do the same for the deep profiling builtin module.

tests/debugger/term_size_words.{m,inp,exp}:
tests/debugger/term_size_cells.{m,inp,exp}:
	Two new test cases, each testing one of the new grades.

tests/debugger/Mmakefile:
	Enable the two new test cases in their grades.

	Disable the tests sensitive to stack frame sizes in term size profiling
	grades.

tests/debugger/completion.exp:
	Add the new "term_size" mdb command to the list of command completions,
	and delete "proc_body".

tests/debugger/declarative/dependency.{inp,exp}:
	Use "print proc_body" instead of "proc_body".

tests/hard_coded/nondet_c.m:
tests/hard_coded/pragma_inline.m:
	Use MR_offset_incr_hp instead of MR_incr_hp to ensure that all places
	that allocate cells also allocate space for the term size slot if
	necessary.

tests/valid/Mmakefile:
	Disable the IL tests in term size profiling grades, since the term size
	profiling primitives haven't been (and probably won't be) implemented
	for the MLDS backends, and handle_options causes a compiler abort
	for grades that combine term size profiling and any one of IL, Java
	and high level C.
2003-10-20 07:29:59 +00:00
..