Benchmarking on randomly generated digraphs shows a speedup of 30-40%.
The time to make dependencies in the compiler directory on my machine
is reduced from 3.65 seconds to 2.40 seconds.
library/digraph.m:
Rewrite digraph.compose more efficiently.
Implement two transitive closure algorithms in the digraph module:
- Basic_TC by Yannis Ioannidis et al.
- STACK_TC by Esko Nuutila, a refinement of the SIMPLE_TC algorithm
previously implemented
On 450 graphs randomly generated by tests/hard_coded/digraph_tc.m,
ranging from 100 to 3000 vertices:
- basic_tc ran from 0.79 to 1.66 times as fast as simple_tc
(mean 1.139, stdev 0.136)
- basic_tc ran from 0.83 to 1.81 times as fast as stack_tc
(mean 1.131, stdev 0.160)
Therefore, after this commit, I will delete the simple_tc and stack_tc
implementations, but they will be available in the version history.
library/digraph.m:
Implement Basic_TC and STACK_TC.
Use map.transform_value in key_set_map_union to replace a search
followed by update.
tests/hard_coded/digraph_tc.m:
Test and benchmark the new algorithms.
Also compare inverse graphs to check that predecessor maps are
maintained properly.
library/digraph.m:
Delete digraph.old_tc, digraph.old_rtc and digraph.slow_rtc.
tests/hard_coded/digraph_tc.m:
tests/hard_coded/digraph_tc.exp:
Delete comparisons using old_tc, old_rtc and slow_rtc.
Implement transitive closure using the simple_tc algorithm from
Esko Nuutila's doctoral thesis.
On a sample of graphs randomly generated by tests/hard_coded/digraph_tc.m,
ranging from 100 to 3000 vertices, the simple_tc implementation ran
from 2.33 to 93 times as fast as the old implementation on my machine.
library/digraph.m:
Rename digraph.tc and digraph.rtc to digraph.old_tc and
digraph.old_rtc. They are kept around for benchmarking,
and will be deleted soon.
Use the simple_tc algorithm to implement digraph.tc.
Use digraph.tc to implement digraph.rtc.
Let key_set_map_add call sparse_bitset.insert_new instead of
sparse_bitset.contains followed by sparse_bitset.insert.
tests/hard_coded/digraph_tc.m:
Add code to benchmark the new and old TC implementations.
The implementation of digraph.rtc was incorrect (as demonstrated in the
new test case), which meant that digraph.tc was also incorrect.
library/digraph.m:
Fix the implementation of rtc (reflexive transitive closure):
- Following the algorithm used in digraph.cliques, it needs to
traverse the graph G in *reverse* depth-first order.
- To find the clique containing a vertex X, it needs to do a DFS on
the *reversed* graph to find the vertices with a path to X.
The vertices that were previously unvisited will be members of
the same clique as X.
- Previously it found the "followers" of the elements of the clique,
and the followers of those followers, then added edges from the
members of the current clique to those followers. However, that
only includes vertices two steps removed from the clique.
I have fixed it to add edges to *all* vertices reachable from
members of the clique.
Add straightforward implementations of tc and rtc for comparison.
Add some comments.
tests/hard_coded/Mmakefile:
tests/hard_coded/digraph_tc.exp:
tests/hard_coded/digraph_tc.inp:
tests/hard_coded/digraph_tc.m:
Add test case.
NEWS:
Announce the fixes.
NEWS:
Mention all the user-visible changes below.
library/enum.m:
Add the typeclass uenum, which is a version of the existing enum typeclass
that maps items to uints, not ints. It also uses a semidet predicate,
not a semidet function, to get back to the item from the uint.
library/sparse_bitset.m:
library/fat_sparse_bitset.m:
Make these modules operate on uints, which means requiring the items
in the sets to be instances of uenum, not enum.
If a few places, improve loops by doing previously-repeated conversions
of [u]ints into <offset, bit-to-set> pairs just once.
library/counter.m:
Define ucounters, which allocate uints. Improve documentation.
library/digraph.m:
Change digraph_keys from ints to uints, since we put them into
sparse_bitsets.
library/int.m:
Make int an instance of the uenum typeclass. This can help users
who currently put ints into sparse_bitsets.
library/pprint.m:
Prettyprint sparse_bitsets as lists of uints.
library/term.m:
Make vars instances of uenum as well as enum.
library/uint.m:
Make uint an instance of the uenum typeclass.
Add the ubits_per_uint function, which allows some casts to be avoided.
compiler/make.deps_set.m:
Change the indexes we put into sparse_bitsets from ints to uints.
compiler/make.make_info.m:
Change the source of those indexes from ints to uints.
compiler/make.top_level.m:
compiler/make.util.m:
Conform to the changes above.
compiler/pre_quantification.m:
Change zones from ints to uints, since we put them into sparse_bitsets.
tests/hard_coded/int_uenum.{m,exp}:
tests/hard_coded/Mmakefile:
Enable the new test case.
tests/valid/use_import_only_for_instance.m:
Update this extract from library/digraph.m the same way as
library/digraph.m itself.
Fix a mistake in the comment on an exported predicate.
Improve the wording in other comments.
Use explicit module qualification in bunch of places, because without them
references to predicate names that occur in many standard library modules
(that digraph.h imports) are ambiguous to human readers, though not to
the compiler.
Replace some functions with state-variable-friendly predicates.
library/diet.m:
library/fat_sparse_bitset.m:
library/sparse_bitset.m:
library/test_bitset.m:
library/tree_bitset.m:
As above. Also, mark as obsolete the same predicates as were marked obsolete
in the other set modules recently.
compiler/mode_robdd.equiv_vars.m:
compiler/mode_robdd.implications.m:
compiler/mode_robdd.tfeirn.m:
library/robdd.m:
library/digraph.m:
Avoid using predicates that are now marked obsolete.
Discussion of these changes can be found on the Mercury developers
mailing list archives from June 2018.
COPYING.LIB:
Add a special linking exception to the LGPL.
*:
Update references to COPYING.LIB.
Clean up some minor errors that have accumulated in copyright
messages.
library/assoc_list.m:
library/bag.m:
library/bimap.m:
library/calendar.m:
library/char.m:
library/digraph.m:
library/list.m:
library/map.m:
library/multi_map.m:
library/psqueue.m:
library/rbtree.m:
library/string.m:
library/term.m:
library/tree234.m:
library/type_desc.m:
library/univ.m:
library/varset.m:
Replace most occurrences of "abort" with "throw an exception".
Slightly improve the documentation for map.search, map.lookup,
map.inverse_search.
library/deconstruct.m:
Replace "abort" with "runtime abort" where that is meant.
library/digraph.m:
Add the predicates
- return_vertices_in_from_to_order
- return_vertices_in_to_from_order
as versions of the existing tsort predicate that specify the order
of the returned vertices.
Add the functions
- return_sccs_in_from_to_order
- return_sccs_in_to_from_order
as versions of the existing atsort function that specify the order
of the returned SCCs.
NEWS:
Announce the new predicates and functions
This is to make the data type follow the inherent semantics of SCCs
more closely, and enforce the invariant that a procedure can appear
in the SCC only once.
Also, rename the list of SCCs from "dependency_ordering", which does
not give a clue about *which way* the SCCs are ordered, to "bottom_up_sccs",
which does.
compiler/dependency_graph.m:
Make the changes described above.
Document why we reverse the list generated by digraph.atsort.
library/digraph.m:
Document the order in which digraph.atsort returns the list of SCCs.
Note that the last step of atsort is to reverse the list, which
its caller in compiler/dependency_graph.m will then immediately
re-reverse.
Document the order in which digraph.tsort and digraph.dfs return
a list of items.
Give some variables more meaningful names, and make the argument order
of some predicates conform to our conventions.
compiler/hlds_out_module.m:
Add code to print out the dependency info in the module_info, if asked.
doc/user_guide.texi:
Document the dump string option that asks for this.
compiler/hlds_dependency_graph.m:
Make the same changes for hlds_dependency_info as dependency_graph.m
did to just plain dependency_info.
compiler/hlds_pred.m:
Make the scc type expand to a set, not a list, of pred_proc_ids.
compiler/dep_par_conj.m:
compiler/stratify.m:
Conform to the changes above, and simplify some code.
compiler/closure_analysis.m:
compiler/ctgc.util.m:
compiler/deep_profiling.m:
compiler/deforest.m:
compiler/exception_analysis.m:
compiler/goal_util.m:
compiler/granularity.m:
compiler/inlining.m:
compiler/lco.m:
compiler/mercury_compile_llds_back_end.m:
compiler/mode_constraints.m:
compiler/rbmm.interproc_region_lifetime.m:
compiler/rbmm.points_to_analysis.m:
compiler/structure_reuse.indirect.m:
compiler/structure_sharing.analysis.m:
compiler/tabling_analysis.m:
compiler/term_constr_build.m:
compiler/term_constr_data.m:
compiler/term_constr_errors.m:
compiler/term_constr_fixpoint.m:
compiler/term_constr_main.m:
compiler/term_constr_pass2.m:
compiler/term_constr_util.m:
compiler/term_errors.m:
compiler/term_pass1.m:
compiler/term_pass2.m:
compiler/term_util.m:
compiler/termination.m:
compiler/trailing_analysis.m:
compiler/tupling.m:
Conform to the changes above.
NOTE: this change does not affect the io module -- I've left that for a
separate change.
library/*.m:
As per the recent change to the coding standard, avoid module
qualification in library interfaces where possible.
Reformat declarations and descriptive comments to better utilise
any space freed up by the above.
Branches: main
Change the argument order of predicates in the sparse_bitset modules to make
it more conducive to the use of state variable notation.
Group function clauses together with the clauses for the corresponding
predicates.
library/sparse_bitset.m:
As above.
library/digraph.m:
compiler/make.dependencies.m:
compiler/mode_robdd.equiv_vars.m:
compiler/mode_robdd.implications.m:
compiler/mode_robdd.tfeirn.m:
tests/hard_coded/bitset_tester.m:
tests/hard_coded/pprint_test.m:
tests/valid/loop_inv_bug.m:
Conform to the above change.
library/digraph.m:
library/getopt.m:
library/getopt_io.m:
library/map.m:
Remove dependencies on the svset module.
NEWS:
Announce the above change.
Branches: main
Change the argument order of many of the predicates in the map, bimap, and
multi_map modules so they are more conducive to the use of state variable
notation, i.e. make the order the same as in the sv* modules.
Prepare for the deprecation of the sv{bimap,map,multi_map} modules by
removing their use throughout the system.
library/bimap.m:
library/map.m:
library/multi_map.m:
As above.
NEWS:
Announce the change.
Separate out the "highlights" from the "detailed listing" for
the post-11.01 NEWS.
Reorganise the announcement of the Unicode support.
benchmarks/*/*.m:
browser/*.m:
compiler/*.m:
deep_profiler/*.m:
extras/*/*.m:
mdbcomp/*.m:
profiler/*.m:
tests/*/*.m:
ssdb/*.m:
samples/*/*.m
slice/*.m:
Conform to the above change.
Remove any dependencies on the sv{bimap,map,multi_map} modules.
Estimated hours taken: 80
Branches: main
The existing representation of goal_paths is suboptimal for several reasons.
- Sometimes we need forward goal paths (e.g. to look up goals), and sometimes
we need reverse goal paths (e.g. when computing goal paths in the first
place). We had two types for them, but
- their names, goal_path and goal_path_consable, were not expressive, and
- we could store only one of them in goal_infos.
- Testing whether goal A is a subgoal of goal B is quite error-prone using
either form of goal paths.
- Using a goal path as a key in a map, which several compiler passes want to
do, requires lots of expensive comparisons.
This diff replaces most uses of goal paths with goal ids. A goal id is an
integer, so it can be used as a key in faster maps, or even in arrays.
Every goal in the body of a procedure gets its id allocated in a depth first
search. Since we process each goal before we dive into is descendants,
the goal representing the whole body of a procedure always gets goal id 0.
The depth first traversal also builds up a map (the containing goal map)
that tells us the parent goal of ever subgoal, with the obvious exception
of the root goal itself. From the containing goal map, one can compute
both reverse and forward goal paths. It can also serve as the basis of an
efficient test of whether the goal identified by goal id A is an ancestor
of another goal identified by goal id B. We don't yet use this test,
but I expect we will in the future.
mdbcomp/program_representation.m:
Add the goal_id type.
Replace the existing goal_path and goal_path_consable types
with two new types, forward_goal_path and reverse_goal_path.
Since these now have wrappers around the list of goal path steps
that identify each kind of goal path, it is now ok to expose their
representations. This makes several compiler passes easier to code.
Update the set of operations on goal paths to work on the new data
structures.
Add a couple of step types to represent lambdas and try goals.
Their omission prior to this would have been a bug for constraint-based
mode analysis, or any other compiler pass prior to the expansion out
of lambda and try goals that wanted to use goal paths to identify
subgoals.
browser/declarative_tree.m:
mdbcomp/rtti_access.m:
mdbcomp/slice_and_dice.m:
mdbcomp/trace_counts.m:
slice/mcov.m:
deep_profiler/*.m:
Conform to the changes in goal path representation.
compiler/hlds_goal:
Replace the goal_path field with a goal_id field in the goal_info,
indicating that from now on, this should be used to identify goals.
Keep a reverse_goal_path field in the goal_info for use by RBMM and
CTGC. Those analyses were too hard to convert to using goal_ids,
especially since RBMM uses goal_paths to identify goals in multi-pass
algorithms that should be one-pass and should not NEED to identify
any goals for later processing.
compiler/goal_path:
Add predicates to fill in goal_ids, and update the predicates
filling in the now deprecated reverse goal path fields.
Add the operations needed by the rest of the compiler
on goal ids and containing goal maps.
Remove the option to set goal paths using "mode equivalent steps".
Constraint based mode analysis now uses goal ids, and can now
do its own equivalent optimization quite simply.
Move the goal_path module from the check_hlds package to the hlds
package.
compiler/*.m:
Conform to the changes in goal path representation.
Most modules now use goal_ids to identify goals, and use a containing
goal map to convert the goal ids to goal paths when needed.
However, the ctgc and rbmm modules still use (reverse) goal paths.
library/digraph.m:
library/group.m:
library/injection.m:
library/pprint.m:
library/pretty_printer.m:
library/term_to_xml.m:
Minor style improvements.
Estimated hours taken: 12
Branches: main
library/digraph.m:
New module for directed graphs. This is essentially the relation
module but with more consistent terminology, and with argument
ordering that suits state variables. Other differences with the
relation module:
- The digraph_key type has a phantom type parameter, which helps to
ensure that keys from one digraph are not used with another digraph.
- Exports a version of digraph.reduced which also returns the mapping
between the original digraph keys and the new ones.
- The implementation of compose/3 doesn't try to use the "domain" and
"range" of the graphs (which is meaningless in the relation module
anyway).
- New, more efficient algorithm for is_dag/1. Correctness proof is
documented.
- components/2 uses a more efficient data representation, and avoids
some intermediate data structures.
- reduced/{2,3} avoids some intermediate data structures.
- tc/2 avoids some intermediate data structures.
library/library.m:
Add the new module.
library/relation.m:
Document that this module is deprecated in favour of digraph.
Flag relation.init/{0,1} as obsolete (it would be better to flag
the entire module, or the relation/1 type as obsolete, but Mercury
does not support this).
NEWS:
Mention that the new module supersedes relation.m and svrelation.m.
compiler/*.m:
profiler/*.m:
Use the digraph module rather than the relation module.