Evaluating the .classes variable in a Mmakefile can be quite slow.
We can speed up mmake invocations by assigning an empty value to
the variable unless we are targeting Java.
On my machine, a call to mmake in an up-to-date compiler directory
now takes about 0.4 seconds, down from 1.2 seconds.
compiler/write_deps_file.m:
As above.
Increase parallelism while making .trans_opt files in the library
directory by deleting more edges in the trans-opt dependency graph.
There are no regressions in the generated .trans_opt files.
On my machine, mmake -j32 trans_opts in the library directory
goes from 23 seconds to approximately 11 seconds.
library/mer_std.trans_opt_deps_spec:
As above.
Let module_allow_deps and module_disallow_deps terms delete an edge
from A to B in the trans-opt dependency graph, even if A and B do not
belong to the same SCC. Previously the edge would only be removed if
A and B imported each other (possibly indirectly), which was
unnecessarily restrictive.
compiler/generate_dep_d_files.m:
As above.
... by using the algorithm now in sparse_bitset.list_to_set.
Keep the old list_to_set algorithm around in both modules for a short while
to allow comparative benchmarking.
library/fat_sparse_bitset.m:
library/sparse_bitset.m:
As above.
tests/hard_coded/speedtest_bitset.m:
A benchmark program to compare the old and new list_to_set algorithms
with each other, and to compare sparse_bitset with fat_sparse_bitset.
library/test_bitset.m:
Instead of comparing just one bitset module (tree_bitset by default)
against set_ordlist, compare all of them (tree_bitset, sparse_bitset,
and fat_sparse_bitset) against set_ordlist.
tests/hard_coded/test_bitsets.{m,exp}:
Rename this test case from test_tree_bitset to test_bitsets, since
(through library/test_bitset.m) we now test ALL bitset implementations,
not just tree_bitset.
tests/hard_coded/Mmakefile:
Refer to the test by its new name.
The current implementation of the unsigned conversion specifiers (i.e. %x, %X,
%o, %u) for fixed-size 8-, 16- and 32-bit signed integers works by casting
these values into (word-sized) ints and then using the existing code for
formatting ints as unsigned values. This does not work for negative values,
because they are sign-extended when converted to ints. For example,
io.format("%x", [i8(min_int8)], !IO)
prints:
ffffffffffffff80 (on a 64-bit machine)
rather than just:
80
Fix the above problem by casting ints, int8s, int16s etc. that are being
formatted as unsigned values to uints and using the uint formatting code.
NOTE: the formatting code for 64-bit integers follows a different code path
and is not affected by any of this.
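The difference between the two behaviours can be illustrated outside
Mercury. The following is a minimal Python sketch, not the code in
string.format.m: the function names are hypothetical, and a 64-bit word
is assumed. It contrasts reinterpreting a negative 8-bit value at word
width (the old behaviour) with reinterpreting it at its own width.

```python
def format_hex_sign_extended(value: int, word_bits: int = 64) -> str:
    """Mimic the old behaviour: widen to a word, then treat as unsigned."""
    return format(value & ((1 << word_bits) - 1), "x")


def format_hex_unsigned(value: int, bits: int) -> str:
    """Treat a signed fixed-width integer as unsigned at its own width."""
    return format(value & ((1 << bits) - 1), "x")


MIN_INT8 = -128
print(format_hex_sign_extended(MIN_INT8))  # ffffffffffffff80
print(format_hex_unsigned(MIN_INT8, 8))    # 80
```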
library/string.format.m:
Implement unsigned conversion specifiers for non-64-bit signed
integers by casting to uint and using the uint formatting code.
Add predicates for converting signed integers into uints.
The format_unsigned_int_* predicates can be deleted after this change
is installed.
compiler/format_call.m:
Implement unsigned conversion specifiers for non-64-bit signed
integers by casting to uint and using the uint formatting code.
compiler/introduced_call_table.m:
Update the table of introduced predicates.
compiler/options.m:
Add an option that can be used to test whether this fix is
installed.
tests/hard_coded/Mmakefile:
tests/hard_coded/Mercury.options:
tests/hard_coded/opt_format_sign_extend.{m,exp}:
Test that formatting code introduced by the compiler does not
accidentally sign-extend negative values.
tests/string_format/string_format_{o,x,u}.{m,exp,exp2}:
Make these tests much more comprehensive than they previously
were.
Use the recently added --trans-opt-deps-spec option to break cycles in
the trans-opt dependency graph for the standard library. This enables
more parallelism when making the .trans_opt files; it now takes about
half as long as before.
Ordering modules sensibly, so that .trans_opt files are created in a
logical order, also improves analysis results for many predicates and
functions. The only results which show a regression with this change are
for deprecated forwarding predicates/functions.
In future, we will probably be able to trim more dependencies to further
improve parallelism, without impacting analysis results.
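As an analogy only, the following Python sketch shows why deleting an
edge from a cyclic dependency graph can expose parallelism. It is not
how the compiler processes the trans-opt dependency graph; the module
names and the build_waves helper are hypothetical, and it relies on the
standard graphlib module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter


def build_waves(deps):
    """Group nodes into waves; nodes in one wave have no pending
    dependencies on each other and could be built in parallel.

    deps maps each node to the set of nodes it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if deps contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical modules: a and b depend on each other, c depends on a.
deps = {"a": {"b"}, "b": {"a"}, "c": {"a"}}
try:
    build_waves(deps)
except CycleError:
    print("cycle: no build order exists")

# Deleting the edge from a to b breaks the cycle.
deps["a"].discard("b")
print(build_waves(deps))  # [['a'], ['b', 'c']]
```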
configure.ac:
Check that the bootstrap compiler supports --trans-opt-deps-spec.
library/mer_std.trans_opt_deps_spec:
Add the spec file that adjusts dependencies in the trans-opt
dependency graph.
library/INTER_FLAGS:
Use the --trans-opt-deps-spec option when building with mmake.
scripts/prepare_install_dir.in:
tools/binary:
tools/bootcheck:
tools/unary:
Copy mer_std.trans_opt_deps_spec when preparing a copy of the
library directory.
Implement two transitive closure algorithms in the digraph module:
- Basic_TC by Yannis Ioannidis et al.
- STACK_TC by Esko Nuutila, a refinement of the SIMPLE_TC algorithm
previously implemented
On 450 graphs randomly generated by tests/hard_coded/digraph_tc.m,
ranging from 100 to 3000 vertices:
- basic_tc ran from 0.79 to 1.66 times as fast as simple_tc
(mean 1.139, stdev 0.136)
- basic_tc ran from 0.83 to 1.81 times as fast as stack_tc
(mean 1.131, stdev 0.160)
Therefore, after this commit, I will delete the simple_tc and stack_tc
implementations, but they will be available in the version history.
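For readers unfamiliar with the operation being benchmarked, the
following Python sketch shows what a transitive closure computes. It is
a naive per-vertex reachability search, not Basic_TC, STACK_TC or
simple_tc, and the function name and graph representation are
assumptions made for illustration.

```python
def transitive_closure(graph):
    """Map each vertex to the set of vertices reachable from it
    by a path of one or more edges.

    graph maps each vertex to an iterable of its direct successors.
    """
    closure = {}
    for start in graph:
        reachable = set()
        stack = list(graph[start])
        while stack:
            vertex = stack.pop()
            if vertex not in reachable:
                reachable.add(vertex)
                stack.extend(graph.get(vertex, ()))
        closure[start] = reachable
    return closure


example = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
closure = transitive_closure(example)
print(sorted(closure["a"]))  # ['a', 'b', 'c']
print(sorted(closure["d"]))  # ['a', 'b', 'c']
```

This naive version repeats work for every start vertex, which is the
kind of cost that dedicated transitive closure algorithms aim to avoid.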
library/digraph.m:
Implement Basic_TC and STACK_TC.
Use map.transform_value in key_set_map_union to replace a search
followed by update.
tests/hard_coded/digraph_tc.m:
Test and benchmark the new algorithms.
Also compare inverse graphs to check that predecessor maps are
maintained properly.
library/sparse_bitset.m:
Replace the old algorithm, which had O(N^2) worst-case behavior,
with a modified form of natural merge sort, whose worst-case complexity
is O(N log N).
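The entry above does not say how the merge sort was modified, so the
following Python sketch shows only a plain natural merge sort that also
drops duplicates, producing a sorted list rather than a bitset. All
names are hypothetical.

```python
def split_runs(items):
    """Split items into maximal strictly ascending runs."""
    runs = []
    run = []
    for item in items:
        if run and item <= run[-1]:
            runs.append(run)
            run = []
        run.append(item)
    if run:
        runs.append(run)
    return runs


def merge_unique(left, right):
    """Merge two sorted, duplicate-free lists, keeping one copy of
    any item that occurs in both."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            out.append(left[i])
            i += 1
        elif right[j] < left[i]:
            out.append(right[j])
            j += 1
        else:
            out.append(left[i])
            i += 1
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def natural_merge_sort_unique(items):
    """Sort items and remove duplicates by repeatedly merging
    adjacent runs until a single run remains."""
    runs = split_runs(items)
    while len(runs) > 1:
        merged = []
        for k in range(0, len(runs), 2):
            if k + 1 < len(runs):
                merged.append(merge_unique(runs[k], runs[k + 1]))
            else:
                merged.append(runs[k])
        runs = merged
    return runs[0] if runs else []


print(natural_merge_sort_unique([5, 1, 2, 2, 9, 3, 3, 0]))
# [0, 1, 2, 3, 5, 9]
```

Because each pass halves the number of runs, an already sorted input
needs no merging at all, and the worst case needs about log N passes
over N items.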
NEWS:
As above.
README.md:
RELEASE_NOTES:
bindist/Mmakefile:
compiler/notes/coding_standards.html:
compiler/notes/developer_intro.html:
Conform to the above change.
compiler/prog_item.m:
We used to record information about include declarations
in parse_tree_int[012] in two forms:
- as a pair of maps from module names to contexts (one each for
includes in the interface and implementation sections), and
- as a single map from module names to an include_module_info, which
recorded the section of its appearance along with its context.
The second of these data structures is derived from the first,
in a process that can result in the generation of diagnostic messages.
In the absence of any issues reported by these diagnostics, the two forms
contain the same information.
Avoid this redundancy by keeping only the second form in the parse trees
of .int0, .int and .int2 files. (.int3 files cannot contain include_module
declarations.)
Since .int2 files may contain include_module declarations only in
the interface section, change the representation of the second form
to a type that expresses this invariant: int_include_module_map,
which is a subtype of the existing type include_module_map.
compiler/comp_unit_interface.m:
compiler/convert_parse_tree.m:
compiler/equiv_type.m:
compiler/get_dependencies.m:
compiler/grab_modules.m:
compiler/make_hlds_separate_items.m:
compiler/module_qual.collect_mq_info.m:
compiler/parse_tree_out.m:
compiler/recompilation.check.m:
compiler/recompilation.version.m:
Conform to the change above.
compiler/item_util.m:
Add a utility predicate for use by new code above.
README.MacOS:
Rename to README.macOS.md.
Fix markdown in a few spots.
Add a missing word.
Describe the OS as "macOS", which is what Apple currently call it.
Fix up some links.
README.md:
Conform to the above change.
compiler/prog_item.m:
We used to record information about import and use declarations
in parse_tree_int[012] in two forms:
- as a quartet of maps from module names to contexts (one each for
int imports, int uses, imp imports and imp uses), and
- as a single map from module names to a section_import_and_or_use,
which recorded the section and kind (import or use) of its appearance
along with its one context, except for the case of modules that have
a use_module declaration in the interface section and an import_module
declaration in the implementation section.
The second of these data structures is derived from the first,
in a process that can result in the generation of diagnostic messages.
In the absence of any issues reported by these diagnostics, the two forms
contain the same information.
Avoid this redundancy by keeping only the second form in the parse trees
of .int0, .int and .int2 files. (For .int3 files, which can contain
only import_modules, and only in the interface section, this redundancy
has not been present even before now.)
Since .int and .int2 files may contain only use_module declarations
and not import_module declarations, change the representation of the
second form to a type that expresses this invariant: the new type
section_use_map, which is a subtype of the existing type
section_import_and_or_use_map.
For .int2 files, we could use an even tighter type right now, but
a fix for Mantis bug #563 would have to undo such a change, so
don't bother.
compiler/comp_unit_interface.m:
Delete the code that used to construct the first form above
for these interface file kinds. Conform to the changes above.
compiler/convert_parse_tree.m:
compiler/equiv_type.m:
compiler/get_dependencies.m:
compiler/grab_modules.m:
compiler/make_hlds_separate_items.m:
compiler/module_qual.collect_mq_info.m:
compiler/parse_tree_out.m:
compiler/recompilation.check.m:
compiler/recompilation.version.m:
Conform to the changes above.
compiler/item_util.m:
Add new, specialized versions of existing utility predicates
to make that conformance possible.
compiler/prog_item.m:
The ptiN_import_use_map fields in the representations of .int0, .int
and .int2 files had the same type as the ptms_import_use_map field
in the parse trees of .m files, which is where they were derived from.
However, while the ptms_import_use_map field needs to be able to represent
implicit imports, the parse trees of .int0, .int and .int2 files
should never include any implicit imports, and in fact any implicit
imports in these fields were already ignored.
Encode the invariant that interface files never include implicit imports
in the types of these fields.
compiler/comp_unit_interface.m:
Discard the implicit part of the source file's import_and_or_use_map
when computing the contents of .int0, .int and .int2 files.
compiler/item_util.m:
Provide the facilities used by the updated code in the modules above.
compiler/convert_parse_tree.m:
compiler/grab_modules.m:
compiler/make_hlds_separate_items.m:
Conform to the changes above.
Convert it to Markdown.
README.CSharp:
Add a .md extension.
Update many of the details in this file.
Add a table of contents.
Break up the FAQ into separate named sections.
README.md:
Conform to the above change of name.
library/sparse_bitset.m:
library/fat_sparse_bitset.m:
Speed up the remove_leq and remove_gt operations by moving a
loop invariant computation, the conversion of the boundary item's index
into an <offset,bitposn> pair, out of the loop.
Eliminate some unnecessary differences between the two modules,
e.g. clear_bit being a predicate rather than a function.
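The loop-invariant hoisting described above can be sketched in Python.
This is an illustration under stated assumptions, not the library code:
the set is modelled as a list of (offset, bits) nodes sorted by offset,
the word size is assumed to be 64 bits, the boundary is assumed to be
non-negative, and all names are hypothetical.

```python
BITS_PER_WORD = 64  # assumed word size for this sketch


def remove_leq(nodes, boundary):
    """Return the nodes of a set with every member <= boundary removed.

    Each node is (offset, bits): bit i of bits set means that
    offset + i is a member. Nodes are sorted by offset.
    """
    # Loop-invariant work, done once rather than once per node:
    # convert the boundary into an <offset, bit position> pair.
    b_offset = (boundary // BITS_PER_WORD) * BITS_PER_WORD
    b_bitpos = boundary % BITS_PER_WORD
    word_mask = (1 << BITS_PER_WORD) - 1
    keep_mask = word_mask & ~((1 << (b_bitpos + 1)) - 1)

    result = []
    for offset, bits in nodes:
        if offset < b_offset:
            continue  # every member of this node is <= boundary
        if offset == b_offset:
            bits &= keep_mask  # clear bit positions 0..b_bitpos
            if bits == 0:
                continue
        result.append((offset, bits))
    return result


# The set {1, 3, 64, 70} as two nodes.
nodes = [(0, 0b1010), (64, (1 << 0) | (1 << 6))]
print(remove_leq(nodes, 3))   # [(64, 65)]
print(remove_leq(nodes, 64))  # [(64, 64)]
```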
library/test_bitset.m:
Add facilities to test the remove_leq and remove_gt operations
of sparse_bitset.m, fat_sparse_bitset.m, and tree_bitset.m
against the same operations on plain old set_ordlists.
Bring this module up to date by requiring set elements to be
members of the uenum typeclass, not the enum typeclass.
Make the test_bitset type a bespoke type.
library/tree_bitset.m:
Add predicate versions of the remove_leq and remove_gt operations
alongside the existing function versions, to allow the new code
in test_bitset.m to work the same way regardless of which bitset module
it is testing.
For uniformity with the other bitset modules, require set elements to be
members of the uenum typeclass, not the enum typeclass.
Change the other integers, such as level numbers, to be unsigned
as well, to avoid the need for casts.
NEWS:
Announce the new additions and changes.
tests/hard_coded/test_tree_bitset.{m,exp}:
Use those new facilities to test those operations, and add some
test sets designed for that purpose.
Add a comment about the limitations of this testing strategy.
tests/hard_coded/bitset_tester.m:
Delete this long-unused module. (It was the original basis of
the test_bitset.m module in the library directory, but it became unused
when test_tree_bitset.m switched to using that module a long time ago.)
library/digraph.m:
Delete digraph.old_tc, digraph.old_rtc and digraph.slow_rtc.
tests/hard_coded/digraph_tc.m:
tests/hard_coded/digraph_tc.exp:
Delete comparisons using old_tc, old_rtc and slow_rtc.
Implement transitive closure using the simple_tc algorithm from
Esko Nuutila's doctoral thesis.
On a sample of graphs randomly generated by tests/hard_coded/digraph_tc.m,
ranging from 100 to 3000 vertices, the simple_tc implementation ran
from 2.33 to 93 times as fast as the old implementation on my machine.
library/digraph.m:
Rename digraph.tc and digraph.rtc to digraph.old_tc and
digraph.old_rtc. They are kept around for benchmarking,
and will be deleted soon.
Use the simple_tc algorithm to implement digraph.tc.
Use digraph.tc to implement digraph.rtc.
Let key_set_map_add call sparse_bitset.insert_new instead of
sparse_bitset.contains followed by sparse_bitset.insert.
tests/hard_coded/digraph_tc.m:
Add code to benchmark the new and old TC implementations.