library/char.m:
Fix description of character ranges recognised by hex_digit_to_int/2.
library/pretty_printer.m:
Fix a copy-and-paste error.
library/string.m:
Fix errors in predicate descriptions.
Fix obsolete pragmas that specified the replacement predicate to
be the target of the pragma.
... when targeting C.
library/string.m:
On 27 Oct 2021, Julien sped up one of these operations
by avoiding the use of sprintf. Apply the same technique,
suitably generalized, to all other similar operations.
Use macros to reduce the amount of code duplication needed.
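As a rough illustration of the sprintf-free technique mentioned above, here is a minimal C sketch. It assumes the operations in question are integer-to-string conversions; the function name int64_to_decimal and its buffer contract are hypothetical, not taken from library/string.m.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the Mercury library's code.
 * Writes the decimal form of `value` into `buf` (NUL-terminated) and
 * returns the number of characters written, excluding the NUL.
 * `buf` must have room for at least 21 bytes: sign, 19 digits, NUL.
 */
size_t int64_to_decimal(int64_t value, char *buf)
{
    char tmp[20];
    size_t n = 0;
    size_t len = 0;
    /* Work on the magnitude as unsigned so INT64_MIN does not overflow. */
    uint64_t mag = (value < 0)
        ? (uint64_t) 0 - (uint64_t) value
        : (uint64_t) value;

    assert(buf != NULL);

    /* Produce the digits least-significant first. */
    do {
        tmp[n++] = (char) ('0' + (mag % 10));
        mag /= 10;
    } while (mag != 0);

    if (value < 0) {
        buf[len++] = '-';
    }
    /* Copy the digits out in most-significant-first order. */
    while (n > 0) {
        buf[len++] = tmp[--n];
    }
    buf[len] = '\0';
    return len;
}
```

The saving comes from skipping the format-string parsing that sprintf has to do on every call. A macro could stamp out one such function per integer width, which is presumably what the duplication remark refers to.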
runtime/mercury_conf.h.in:
Fix an old bug: spell MR_MERCURY_IS_{32,64}_BITS correctly,
to match the name that is (conditionally) defined by configure.
configure.ac:
Delete references to MERCURY_IS_{32,64}_BITS, which turn out
to be totally unused.
configure.ac:
Require the installed compiler to support this new builtin.
library/private_builtin.m:
Declare the builtin, making it usable.
library/string.m:
Use it to replace the old hand-written range check.
compiler/print_help.m:
Replace all occurrences of "e.g." in generated texinfo with
"e.g.@:", which apparently generates better looking output.
doc/user_guide.texi:
Expect the changes.
library/string.m:
Add a new version of an existing predicate to help with the above.
NEWS.md:
Announce the new predicate.
Add the following predicates to find the first or last occurrence of a
code point in a string:
find_first_char
We already had the code to implement contains_char.
Not strictly necessary as we have sub_string_search.
find_first_char_start
Safe wrapper for unsafe_find_first_char_start.
unsafe_find_first_char_start
This is just the body of find_first_char, which should be useful for
users. Not strictly needed as we have sub_string_search_start.
find_last_char
Commonly needed.
NOTE: I also considered these predicates but discarded them for now:
:- pred find_first_char_between(string::in, char::in,
int::in, int::in, int::out) is semidet.
:- pred find_last_char_between(string::in, char::in,
int::in, int::in, int::out) is semidet.
:- pred find_first_match_between(pred(char)::in(pred(in) is semidet),
string::in, int::in, int::in, int::out) is semidet.
:- pred find_last_match_between(pred(char)::in(pred(in) is semidet),
string::in, int::in, int::in, int::out) is semidet.
The _between predicates required a bit more code than I'd like, for the
amount of use that they would (I imagine) get. The _match predicates
were just conveniences over iterating over a string manually.
All four predicates would incur calls to strlen() in C grades,
which suggests adding "unsafe" versions as well.
library/string.m:
Add the predicates above.
Implement string.contains_char using string.find_first_char.
tests/hard_coded/Mmakefile:
tests/hard_coded/string_find_char.exp:
tests/hard_coded/string_find_char.exp2:
tests/hard_coded/string_find_char.m:
Add test case.
NEWS.md:
Announce additions.
Generate "Did you mean ..." messages for unrecognized long options.
compiler/mercury_compile_main.m:
If we have an unrecognized long option or negated long option,
then attempt to generate a "Did you mean ..." message for it.
compiler/error_spec.m:
Add a version of maybe_construct_did_you_mean_pieces that lets
you add a prefix to the strings before including them in the pieces.
This is required for handling the "--" and "--no-" long option
prefixes. Both versions of maybe_construct_did_you_mean_pieces
are implemented using the same underlying piece of code.
compiler/options.m:
Add predicates for returning the set of all long options and all negatable
long options. To make this possible, shift the long option table into a
new predicate and add an (out, out) is multi mode for it.
library/getopt.m:
Update the documentation to include maybe_string_special options
among the option types that are negatable.
library/string.m:
Add the function add_prefix/2.
NEWS:
Announce the addition to the string module.
For now, the implementation covers only non-lookup switches.
compiler/builtin_ops.m:
Generalize the existing offset_str_eq binary op by adding an optional
size parameter, which, if present, restricts the equality test to look at
the given number of code units at most.
compiler/llds_out_data.m:
compiler/mlds_to_c_data.m:
Generalize the output of binop rvals whose operation is offset_str_eq.
In llds_out_data.m, fix a bug in the original code. (This bug did not
lead to problems because before this diff, we never generated this op.)
compiler/string_switch_util.m:
Add a predicate that recognizes when a trie node that is NOT a leaf
nevertheless represents the top of a stick, which means that it has
only one possible next code unit, which itself may have only one
possible next code unit, and so on, until we reach a node that *does*
have two or more next code units. (One of those may be the code unit
of the string-ending NULL character.)
compiler/ml_string_switch.m:
Use the new predicate in string_switch_util.m to generate better code
for sticks. Instead of comparing each character in the stick individually
against the relevant code unit of the string being switched on, compare
them all at once using the new binary op.
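The difference between the original op and its size-limited form can be sketched in C as follows. The function names are hypothetical; the actual macro added to runtime/mercury_string.h may be named and written differently. Both functions assume that each string has at least `offset` code units.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only. */

/* Equality of the two strings from code unit `offset` to the end. */
int offset_str_eq(size_t offset, const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);
    return strcmp(s1 + offset, s2 + offset) == 0;
}

/* Equality of at most `size` code units starting at code unit `offset`. */
int offset_strn_eq(size_t offset, const char *s1, const char *s2,
    size_t size)
{
    assert(s1 != NULL && s2 != NULL);
    return strncmp(s1 + offset, s2 + offset, size) == 0;
}
```

For a stick covering code units 2 to 5 of "abcdef...", a single call such as offset_strn_eq(2, Str, "abcdef", 4) replaces four separate code unit comparisons.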
compiler/ml_switch_gen.m:
Insist on both the host machine and the target machine
using the C backend.
compiler/string_switch.m:
Implement non-lookup trie switches. The code follows the approach used
in ml_string_switch.m as much as possible, but there are plenty of
differences caused by targeting the LLDS.
Rename some predicates to specify which switch implementation method
they belong to.
Write a comment just once, and refer to it from elsewhere instead of
duplicating it at each reference site.
compiler/switch_gen.m:
Enable the use of trie switches when the option values call for it,
and when the switch is not a lookup switch.
compiler/cse_detection.m:
Do not flood the output of mmc -V with messages that have nothing to do
with the module being compiled.
compiler/options.m:
Add a way to specify --no-allow-inlining on the command line.
This can help debug code generator changes like this, by disallowing
a transform that can modify the Mercury code whose compilation process
you are trying to debug. (The documentation of the --inlining option
implies that --no-inlining should do the same job, but it does not.)
The option is not documented for users.
compiler/string_encoding.m:
Provide a version of from_code_unit_list_in_encoding that allows
non-well-formed code unit sequences as input, and provide det versions
of both versions. This is for use by both string_switch.m and
ml_string_switch.m.
compiler/hlds_goal.m:
Document the properties of case_ids.
compiler/llds.m:
Document the possibility that string constants are not well formed.
compiler/bytecode.m:
compiler/code_util.m:
compiler/mlds_dump.m:
compiler/ml_global_data.m:
compiler/mlds_to_cs_data.m:
compiler/mlds_to_java_data.m:
compiler/opt_debug.m:
Conform to the changes above.
library/string.m:
Replace the non-exported test predicate internal_encoding_is_utf8 with
an exported function that returns an enum specifying the string encoding.
NEWS.md:
Announce the new function.
runtime/mercury_string.h:
Add the C macro that implements the new form of the offset_str_eq
binary op.
tests/hard_coded/string_switch4.{m,exp}:
We have long had three copies of the exact same code, in string_switch.m,
string_switch2.m and string_switch3.m, which were compiled with
- no smart switch implementation
- smart switch implementation forced to use the hash table method
- smart switch implementation forced to use binary search method
Add this new copy, which is compiled with
- smart switch implementation forced to use the new trie method
tests/hard_coded/Mmakefile:
Add the new test case.
tests/hard_coded/Mercury.options:
Update the options of the test cases, and specify them for the new one.
tests/hard_coded/string_switch.m:
tests/hard_coded/string_switch2.m:
tests/hard_coded/string_switch3.m:
Update the top-of-module comment block to be identical in all four copies
of this module.
library/char.m:
Export a new predicate, char_int_is_surrogate, that duplicates
the job of the MR_is_surrogate macro in the runtime, for use by
the new check in string.m. Add a comment about the code duplication.
The new predicate is not documented for users.
runtime/mercury_string.h:
Add a comment about the code duplication.
library/string.m:
Use the new predicate in char.m to check for surrogates when converting
a code unit list to a utf8 string.
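For reference, the surrogate test itself is a simple range check. The C sketch below uses a hypothetical function name and is not the text of MR_is_surrogate or of char_int_is_surrogate.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: succeeds if `code_point` lies in the UTF-16
 * surrogate range U+D800..U+DFFF. Such values are not valid Unicode
 * scalar values, so a well-formed UTF-8 string never encodes them.
 */
int is_surrogate(uint32_t code_point)
{
    /* Precondition: the caller passes a value in the Unicode range. */
    assert(code_point <= 0x10FFFF);
    return code_point >= 0xD800 && code_point <= 0xDFFF;
}
```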
library/string.m:
Use variable names that give more info about their meaning, such as
specifying whether the code units they refer to are utf8 or utf16.
Fix unnecessary differences between the variable names used in related
predicates/functions.
Give some predicates more descriptive names.
Add some explicit module qualifications.
library/string.m:
Fix a bug in the word_wrap_separator function, which could cause it
to add unnecessary line breaks. The bug was that it
- insisted on adding all line breaks needed for extremely long words at once, while
- the logic of the code that did that did not consider all the conditions
that the main code path considered.
The fix is to handle adding line breaks to extremely long words
one line break at a time, judging the need for those line breaks
using the main code path.
Improve the documentation of word_wrap and word_wrap_separator.
Replace use of reversed lists with cords where this improves clarity.
Use both predicate names and variable names that directly specify
whether they refer to code units or code points.
Add some XXXs on dodgy code.
For two predicates, inline them at their only call sites.
tests/general/string_test.exp:
Expect fixed output from word_wrap_separator.
library/string.m:
Even though format_table_max is a minor tweak on format_table,
its implementation used to be completely separate. Act on an old XXX
and make format_table use the same primitive ops as format_table_max.
Document the operation of format_table a bit better.
Change the way that format_table_max handles column-width limits,
by accepting overlong column contents *without* starting a new line.
Document the new semantics.
Use predmode decls when possible.
tests/general/string_test.{m,exp}:
Add a test of format_table_max, which previously did not have one.
library/io.m:
Add the predicates
read_named_file_as_string_wf
read_named_file_as_lines_wf
read_file_as_string_wf
read_file_as_string_and_num_code_units_wf
which extend their base predicates (without the _wf suffix) by checking
whether the string read from the file is well formed, and returning
an error if it is not.
library/io.stream_op.m:
Fix and expand a comment.
library/io.text_read.m:
Add a comment.
library/string.m:
Add check_well_formedness, a predicate that checks whether a string
is well formed, and (if relevant) specifies the offset of the first
non-well-formed character.
Make the documentation of index_next and index_next_repl more detailed.
runtime/mercury_string.[ch]:
Define MR_utf8_find_ill_formed_char, a version of MR_utf8_verify
that returns the offset of the first ill-formed UTF-8 char in the
given string, if there is one. This is used in the implementation of
check_well_formedness for C.
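To make the idea concrete, here is a minimal C sketch of a scan that reports the offset of the first ill-formed UTF-8 sequence. It is not the runtime's MR_utf8_find_ill_formed_char; the name, the return type and the use of -1 for "well formed" are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch. Returns the byte offset of the first ill-formed
 * UTF-8 sequence in the NUL-terminated string `s`, or -1 if the whole
 * string is well formed.
 */
long utf8_find_ill_formed(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;
    size_t i = 0;

    assert(s != NULL);

    while (p[i] != 0) {
        unsigned char b0 = p[i];
        size_t len;         /* total length of this sequence */
        unsigned long cp;   /* code point decoded so far */
        unsigned long min;  /* smallest code point allowed for `len` */
        size_t k;

        if (b0 < 0x80) {
            i++;            /* ASCII: always well formed */
            continue;
        } else if ((b0 & 0xE0) == 0xC0) {
            len = 2; cp = b0 & 0x1F; min = 0x80;
        } else if ((b0 & 0xF0) == 0xE0) {
            len = 3; cp = b0 & 0x0F; min = 0x800;
        } else if ((b0 & 0xF8) == 0xF0) {
            len = 4; cp = b0 & 0x07; min = 0x10000;
        } else {
            return (long) i;    /* stray continuation or invalid lead byte */
        }

        for (k = 1; k < len; k++) {
            unsigned char b = p[i + k];
            /* A NUL fails this test too, so we never read past the end. */
            if ((b & 0xC0) != 0x80) {
                return (long) i;
            }
            cp = (cp << 6) | (b & 0x3F);
        }

        /* Reject overlong forms, surrogates and values above U+10FFFF. */
        if (cp < min || (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) {
            return (long) i;
        }
        i += len;
    }
    return -1;
}
```

Returning the offset rather than a plain yes/no answer is what allows an error message to say where in the file the problem is.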
NEWS.md:
Announce the new library predicates.
Sort some lists of pred names.
library/string.m:
... in favor of the s/codepoint/code_point/ versions.
NEWS.md:
Mention that these predicates and functions have been marked obsolete.
Mention that all the X_to_doc functions in modules other than
pretty_printer.m have been marked obsolete in favour of the versions
in pretty_printer.m.
Standardize on indenting lists of function and predicate names
by four spaces, not three. (There were more than five times as many
that were indented by four than by three.)
library/string.m:
For each predicate and function whose name includes "codepoint",
- create a version in which "codepoint" is replaced by "code_point",
- make this version the main implementation, making the "codepoint"
versions forward to the "code_point" versions,
- add obsolete pragmas for the "codepoint" versions, though these are
commented out for now. This is so that an installed compiler containing
this change will already have the recommended alternative available
when the commenting-out is removed (maybe in a week or so).
NEWS.md:
Announce the new predicates and functions.
compiler/c_util.m:
compiler/const_prop.m:
compiler/inst_check.m:
compiler/parse_tree_out_term.m:
compiler/structure_reuse.direct.choose_reuse.m:
compiler/write_error_spec.m:
library/pprint.m:
library/pretty_printer.m:
library/string.format.m:
Replace all uses of the "codepoint" versions with the "code_point"
versions.
library/array.m:
library/char.m:
library/float.m:
library/int.m:
library/int16.m:
library/int32.m:
library/int64.m:
library/int8.m:
library/list.m:
library/one_or_more.m:
library/string.m:
library/tree234.m:
library/uint.m:
library/uint16.m:
library/uint32.m:
library/uint64.m:
library/uint8.m:
library/version_array.m:
Mark the X_to_doc function in each of these modules as obsolete,
and make it a forwarding function to the actual implementation
in pretty_printer.m. The intention is that when these forwarding
functions are eventually removed, this will also remove the dependency
of these modules on pretty_printer.m. This should help at least some
of these modules escape the giant SCC in the library's dependency graph.
(It does not make sense that a library module that adds code to increment
an int thereby becomes dependent on pretty_printer.m through int.m.)
library/pretty_printer.m:
Move all the X_to_doc functions from the above modules here.
Fix the one_or_more_to_doc function, which was
- missing the comma between the two arguments of the one_or_more
function symbol, and
- would print "..., ...]" instead of just "...]" at the end of the
tail list when that list exceeded the limits of the specified pp_params.
Rename one of the moved types along with its function symbols,
to reduce ambiguity.
Put arrays before their indexes in the argument lists of some of
the moved functions.
Some of the moved X_to_doc functions for compound types returned
a doc that had an indent wrapper. These indents differed between the
various X_to_doc functions without any visible reason, but they are
also redundant. The callers can trivially add such wrappers if they
want to, but taking them off, if they want them off, is harder.
Eliminate the problem by deleting all such indent wrappers.
Add formatters for the intN, uintN and one_or_more types to the
default formatter map. Their previous absence was an oversight.
Add a function, get_formatter_map_entry_types, that returns the ids
of the types in the formatter_map given to the function. It is intended
for tests/hard_coded/test_pretty_printer_defaults.m, but is exported
for anyone to use.
tests/hard_coded/test_pretty_printer_defaults.{m,exp}:
Use get_formatter_map_entry_types to print the default formatter map
in a format that is much more easily readable.
NEWS:
Announce all the user-visible changes above.
Mercury inherited its original system of operator priorities from Prolog,
because during its initial development, we wanted to be able to execute
the Mercury compiler using NU-Prolog and later SICStus Prolog.
That consideration has long been obsolete, and now we may fix the
design error that gifted Prolog with its counter-intuitive system
of operator priorities, in which higher *numerical* operator priorities
mean lower *actual* priorities. This diff does that.
library/ops.m:
Change the meaning of operator priorities, to make higher numerical
priorities mean also higher actual priorities.
This semantic change requires corresponding changes in any other module
that uses ops.m. To force this change to occur, change the type
representing priorities from being a synonym for a bare int to being
a notag wrapper around an uint.
The old "assoc" type had a misleading name, since it is related to
associativity but is not itself a representation of associativity.
Its two function symbols, which used to be just "x" and "y", meant that
the priority of an argument must be (respectively) greater than,
or greater than or equal to, the priority of the operator. So rename
x to arg_gt, y to arg_ge, and assoc to arg_prio_gt_or_ge.
Rename the old adjust_priority_for_assoc predicate to min_priority_for_arg,
which better expresses its semantics. Turn it into a function, since
some of its users want it that way, and move its declaration to the
public part of the interface.
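The relationship described above can be rendered in C as a rough sketch. The real definitions are Mercury types and functions in library/ops.m; the C type names, the struct wrapper and the "+ 1" step below are assumptions inferred from the description, not a transcription of the library code.

```c
#include <assert.h>

/* Hypothetical C rendering of the ideas, for illustration only. */

/* A wrapper type, so that a bare integer cannot be passed by accident. */
typedef struct { unsigned value; } priority;

/* Must the argument's priority be strictly greater, or at least equal? */
typedef enum { ARG_GT, ARG_GE } arg_prio_gt_or_ge;

/*
 * With higher numbers meaning tighter binding, the lowest priority an
 * argument may have is the operator's own priority for ARG_GE, and the
 * next priority up for ARG_GT.
 */
priority min_priority_for_arg(priority op_prio, arg_prio_gt_or_ge rel)
{
    priority result = op_prio;

    assert(rel == ARG_GT || rel == ARG_GE);
    if (rel == ARG_GT) {
        result.value = op_prio.value + 1;
    }
    return result;
}
```

Wrapping the number in a distinct type is what forces every user of the old interface to be revisited: code that still passes a bare integer no longer compiles.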
Add a method named tightest_op_priority to replace the use of 0
as a priority.
Rename the max_priority method as the loosest_op_priority method.
Add a method named universal_priority to replace the use of
max_priority + 1 as a priority.
Add a function to return the priority of the comma operator,
to allow other modules to stop hardcoding it.
Add operations for comparing priorities and for incrementing/decrementing
priorities.
Change the prefix on the names of the predicates that take op_infos
as inputs from "mercury_op_table_" to "op_infos_", since the old prefix
was misleading.
Add a note on a significant old problem with an exported type synonym.
library/mercury_term_parser.m:
Conform to the changes above.
Delete unnecessary module qualifiers, since they were just clutter.
Add "XXX OPS" to note further opportunities for improvement.
library/pprint.m:
Conform to the changes above.
Rename a function to avoid ambiguity.
library/pretty_printer.m:
library/stream.string_writer.m:
library/string.to_string.m:
library/term_io.m:
Conform to the changes above.
library/string.m:
Add a note on a significant old problem.
NEWS:
Announce the user-visible changes.
tests/hard_coded/bug383.m:
Update this test case to use the new system of operator priorities.
tests/hard_coded/term_io_test.{m,inp}:
Fix white space.
extras/old_library_modules/old_mercury_term_parser.m:
extras/old_library_modules/old_ops.m:
The old contents of the mercury_term_parser and ops modules,
in case anyone wants to continue using them instead of updating
their code to use their updated equivalents.
samples/calculator2.m:
Import the old versions of mercury_term_parser and ops.
configure.ac:
Require the installed compiler to support disable_warning scopes
for unknown_format_calls.
compiler/Mercury.options:
library/Mercury.options:
Do not disable unknown_format_call warnings in whole files.
compiler/parse_tree_out_info.m:
compiler/pd_debug.m:
library/io.m:
library/stream.string_writer.m:
library/string.m:
Disable unknown_format_call warnings for just the format calls
that need it.
library/string.m:
Fix string.all_match to fail if the string being tested contains
any ill-formed code unit sequences.
Fix the Mercury implementation of string.contains_char to continue
searching for the character past any ill-formed code unit sequences.
The foreign code implementations already did so.
Fix unsafe_index_next_repl and unsafe_prev_index_repl in C grades.
We indexed the C string with `ReplacedCodeUnit = Str[Index]' but
since Mercury strings have the C type `char *', ReplacedCodeUnit
could take on a negative value. When ReplacedCodeUnit == -1,
the caller would assume there is a valid encoding of a code point
beginning at Index, and therefore return `not_replaced' instead of
`replaced_code_unit(255)'.
Add casts to `unsigned char' in other places where we index C
strings.
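The signedness problem can be shown with a small C sketch. The helper name is hypothetical; only the cast reflects the fix described above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: read the code unit at `index` as a value in the
 * range 0..255, whatever the signedness of plain `char`.
 */
int code_unit_at(const char *str, size_t index)
{
    assert(str != NULL);
    /*
     * Without the cast, a byte such as 0xFF reads as -1 on platforms
     * where plain `char` is signed, which a caller can mistake for a
     * "no code unit" or "valid code point follows" sentinel.
     */
    return (unsigned char) str[index];
}
```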
Clarify documentation.
Document that string.duplicate_char may throw an exception.
tests/hard_coded/string_all_match.m:
tests/hard_coded/string_all_match.exp:
tests/hard_coded/string_all_match.exp2:
Add test for string.all_match.
tests/hard_coded/contains_char_2.m:
tests/hard_coded/contains_char_2.exp:
Delete older test case for string.contains_char.
tests/hard_coded/string_contains_char.m:
tests/hard_coded/string_contains_char.exp:
tests/hard_coded/string_contains_char.exp2:
Add better test for string.contains_char.
tests/hard_coded/Mmakefile:
Enable the tests.
NEWS:
Announce the changes.
Add a new predicate that tests if a string contains any characters for which
a given test predicate succeeds.
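The shape of such a predicate can be sketched in C with a function pointer standing in for the test predicate. This is a simplification: the sketch tests individual bytes, whereas the Mercury predicate tests characters, and the name string_contains_match is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Same signature as the <ctype.h> classification functions. */
typedef int (*char_test)(int);

/*
 * Hypothetical byte-wise sketch: returns 1 if `test` succeeds for at
 * least one byte of `str`, and 0 otherwise.
 */
int string_contains_match(char_test test, const char *str)
{
    const unsigned char *p = (const unsigned char *) str;

    assert(test != NULL && str != NULL);
    for (; *p != '\0'; p++) {
        if (test(*p)) {
            return 1;
        }
    }
    return 0;
}
```

With this shape, a whitespace check like the one removed from compiler/options.m becomes a single call, for example string_contains_match(isspace, Str).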
library/string.m:
Add the new predicate.
compiler/options.m:
Replace the predicate string_contains_whitespace/1 defined here
with a call to the new library predicate.
NEWS:
Announce the new predicate.
tests/hard_coded/Mmakefile:
tests/hard_coded/string_contains_match.{m,exp}:
Add a test of the new predicate.
compiler/add_clause.m:
Generate a warning for mode-specific clauses when the clause is for
a predicate that has only one mode, provided that the warning is enabled.
compiler/options.m:
Add an option to enable this warning.
doc/user_guide.texi:
Document this option.
library/exception.m:
library/int.m:
library/rtti_implementation.m:
library/string.m:
Delete modes from clause heads that would get this warning.
tests/valid/spurious_purity_warning.m:
Delete modes from clause heads that would get this warning.
Do not interleave predicate definitions.
tests/warnings/unneeded_mode_specific_clause.{m,exp}:
A test case for this warning.
tests/warnings/Mmakefile:
Enable the new test case.
tests/invalid/multimode_syntax.err_exp:
Expect the new warning.
library/string.m:
Delete procedures that have been marked as obsolete since 2019.
NEWS:
Announce the deletions.
tests/hard_coded/string_append_ooi.m:
tests/hard_coded/string_append_ooi_ilseq.m:
tests/hard_coded/string_presuffix.{m,exp}:
Conform to the above changes.
In the Mercury standard library, every exported predicate or function
has (or at least *should* have) a comment that documents it, including
the meanings of its arguments. About 35-40% of these modules put `'s
(left and right quotes) around the names of the variables representing
those arguments. Some tried to do it consistently (though there were spots
with unquoted or half quoted names), while some did it in only a few places.
This is inconsistent: we should either do it everywhere, or nowhere.
This diff makes it nowhere, because
- this is what the majority of the standard library modules do;
- this is what virtually all of the modules in the compiler, profiler,
deep_profiler etc directories do;
- typing all those quotes when adding new predicates in modules that
follow this convention is a pain in the ass; and because
- on many modern terminals, `' looks non-symmetrical and weird.
Likewise, the comment explaining a predicate often started with
% `predname(arguments)' returns ...
This diff deletes these quotes as well, since they add nothing useful.
This diff does leave in place quotes around code fragments, both terms
and goals, where this helps delineate the boundaries of that fragment.