496 Commits

Author SHA1 Message Date
Zoltan Somogyi
daba30c48e Stop using higher order insts as modes ...
... in a third batch of library modules.
2023-07-26 22:26:51 +02:00
Zoltan Somogyi
acf55e9631 Fix a comment. 2023-07-20 17:04:48 +02:00
Peter Wang
025472eb02 Fix spelling. 2023-05-29 11:03:39 +10:00
Zoltan Somogyi
d769b04a96 Base string.format_table{,_max} on common code.
library/string.m:
    Even though format_table_max is a minor tweak on format_table,
    its implementation used to be completely separate. Act on an old XXX
    and make format_table use the same primitive ops as format_table_max.

    Document the operation of format_table a bit better.

    Change the way that format_table_max handles column-width limits,
    by accepting overlong column contents *without* starting a new line.
    Document the new semantics.

    Use predmode decls when possible.

tests/general/string_test.{m,exp}:
    Add a test of format_table_max, which previously did not have one.
2023-05-22 19:23:37 +10:00
Zoltan Somogyi
556d9a4fea Improve variable names. 2023-05-02 01:32:22 +10:00
Julien Fischer
199bd81ee6 Fix standard library compilation in the C# grade.
library/string.m:
    As above.
2023-04-21 23:41:46 +10:00
Zoltan Somogyi
2a12732260 Improve well-formedness checks on UTF strings.
library/io.m:
    Add the predicates

        read_named_file_as_string_wf
        read_named_file_as_lines_wf
        read_file_as_string_wf
        read_file_as_string_and_num_code_units_wf

    which extend their base predicates (without the _wf suffix) by checking
    whether the string read from the file is well formed, and returning
    an error if it is not.

library/io.stream_op.m:
    Fix and expand a comment.

library/io.text_read.m:
    Add a comment.

library/string.m:
    Add check_well_formedness, a predicate that checks whether a string
    is well formed, and (if relevant) specifies the offset of the first
    non-well-formed character.

    Make the documentation of index_next and index_next_repl more detailed.

runtime/mercury_string.[ch]:
    Define MR_utf8_find_ill_formed_char, a version of MR_utf8_verify
    that returns the offset of the first ill-formed UTF-8 char in the
    given string, if there is one. This is used in the implementation of
    check_well_formedness for C.

NEWS.md:
    Announce the new library predicates.

    Sort some lists of pred names.
2023-04-21 15:55:23 +10:00
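The check described in the commit above (report whether a string is well formed and, if not, the offset of the first ill-formed sequence) can be sketched as follows. This is a Python illustration of the idea only, not the Mercury or C code from the commit; the function name find_ill_formed_utf8 is hypothetical, and the work is delegated to Python's strict UTF-8 decoder.

```python
def find_ill_formed_utf8(data: bytes):
    """Return the byte offset of the first ill-formed UTF-8 sequence,
    or None if the whole buffer is well formed.

    Hypothetical helper: it mirrors the role the commit message gives to
    MR_utf8_find_ill_formed_char, but shares no code with it.
    """
    try:
        # Strict decoding rejects stray continuation bytes, truncated
        # sequences, overlong forms and encoded surrogates.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that could not be decoded.
        return err.start
    return None
```

A caller in the style of the _wf predicates would return the data when the result is None, and otherwise an error that mentions the offset.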
Zoltan Somogyi
278ebc172a Mark string.*codepoint* as obsolete ...
library/string.m:
    ... in favor of the s/codepoint/code_point/ versions.

NEWS.md:
    Mention that these predicates and functions have been marked obsolete.

    Mention that all the X_to_doc functions in modules other than
    pretty_printer.m have been marked obsolete in favour of the versions
    in pretty_printer.m.

    Standardize on indenting lists of function and predicate names
    by four spaces, not three. (There were more than five times as many
    that were indented by four than by three.)
2023-02-08 19:51:49 +11:00
Zoltan Somogyi
726da4f03c Prepare to s/codepoint/code_point/ in string.m.
library/string.m:
    For each predicate and function whose name includes "codepoint",

    - create a version in which "codepoint" is replaced by "code_point",
    - make this version the main implementation, making the "codepoint"
      versions forward to the "code_point" versions,
    - add obsolete pragmas for the "codepoint" versions, though these are
      commented out for now. This is so that an installed compiler containing
      this change will already have the recommended alternative available
      when the commenting-out is removed (maybe in a week or so).

NEWS.md:
    Announce the new predicates and functions.

compiler/c_util.m:
compiler/const_prop.m:
compiler/inst_check.m:
compiler/parse_tree_out_term.m:
compiler/structure_reuse.direct.choose_reuse.m:
compiler/write_error_spec.m:
library/pprint.m:
library/pretty_printer.m:
library/string.format.m:
    Replace all uses of the "codepoint" versions with the "code_point"
    versions.
2023-02-02 19:59:10 +11:00
Zoltan Somogyi
5cbcfaa0ed Move X_to_doc functions to pretty_printer.m.
library/array.m:
library/char.m:
library/float.m:
library/int.m:
library/int16.m:
library/int32.m:
library/int64.m:
library/int8.m:
library/list.m:
library/one_or_more.m:
library/string.m:
library/tree234.m:
library/uint.m:
library/uint16.m:
library/uint32.m:
library/uint64.m:
library/uint8.m:
library/version_array.m:
    Mark the X_to_doc function in each of these modules as obsolete,
    and make it a forwarding function to the actual implementation
    in pretty_printer.m. The intention is that when these forwarding
    functions are eventually removed, this will also remove the dependency
    of these modules on pretty_printer.m. This should help at least some
    of these modules escape the giant SCC in the library's dependency graph.
    (It does not make sense that a library module that adds code to increment
    an int thereby becomes dependent on pretty_printer.m through int.m.)

library/pretty_printer.m:
    Move all the X_to_doc functions from the above modules here.

    Fix the one_or_more_to_doc function, which was

    - missing the comma between the two arguments of the one_or_more
      function symbol, and

    - would print "..., ...]" instead of just "...]" at the end of the
      tail list when that list exceeded the limits of the specified pp_params.

    Rename one of the moved types along with its function symbols,
    to reduce ambiguity.

    Put arrays before their indexes in the argument lists of some of
    the moved functions.

    Some of the moved X_to_doc functions for compound types returned
    a doc that had an indent wrapper. These indents differed between the
    various X_to_doc functions without any visible reason, but they are
    also redundant. The callers can trivially add such wrappers if they
    want to, but taking them off, if they want them off, is harder.
    Eliminate the problem by deleting all such indent wrappers.

    Add formatters for the intN, uintN and one_or_more types to the
    default formatter map. Their previous absence was an oversight.

    Add a function, get_formatter_map_entry_types, that returns the ids
    of the types in the formatter_map given to the function. It is intended
    for tests/hard_coded/test_pretty_printer_defaults.m, but is exported
    for anyone to use.

tests/hard_coded/test_pretty_printer_defaults.{m,exp}:
    Use get_formatter_map_entry_types to print the default formatter map
    in a format that is much more easily readable.

NEWS:
    Announce all the user-visible changes above.
2022-12-27 18:27:52 +11:00
Zoltan Somogyi
de75b98b18 Make higher operator priorities bind tighter.
Mercury inherited its original system of operator priorities from Prolog,
because during its initial development, we wanted to be able to execute
the Mercury compiler using NU-Prolog and later SICStus Prolog.
That consideration has long been obsolete, and now we may fix the
design error that gifted Prolog with its counter-intuitive system
of operator priorities, in which higher *numerical* operator priorities
mean lower *actual* priorities. This diff does that.

library/ops.m:
    Change the meaning of operator priorities, to make higher numerical
    priorities mean also higher actual priorities.

    This semantic change requires corresponding changes in any other module
    that uses ops.m. To force this change to occur, change the type
    representing priorities from being a synonym for a bare int to being
    a notag wrapper around a uint.

    The old "assoc" type had a misleading name, since it is related to
    associativity but is not itself a representation of associativity.
    Its two function symbols, which used to be just "x" and "y", meant that
    the priority of an argument must be (respectively) greater than,
    or greater than or equal to, the priority of the operator. So rename
    x to arg_gt, y to arg_ge, and assoc to arg_prio_gt_or_ge.

    Rename the old adjust_priority_for_assoc predicate to min_priority_for_arg,
    which better expresses its semantics. Turn it into a function, since
    some of its users want it that way, and move its declaration to the
    public part of the interface.

    Add a method named tightest_op_priority to replace the use of 0
    as a priority.

    Rename the max_priority method as the loosest_op_priority method.

    Add a method named universal_priority to replace the use of
    max_priority + 1 as a priority.

    Add a function to return the priority of the comma operator,
    to allow other modules to stop hardcoding it.

    Add operations for comparing priorities and for incrementing/decrementing
    priorities.

    Change the prefix on the names of the predicates that take op_infos
    as inputs from "mercury_op_table_" to "op_infos_", since the old prefix
    was misleading.

    Add a note on a significant old problem with an exported type synonym.

library/mercury_term_parser.m:
    Conform to the changes above.

    Delete unnecessary module qualifiers, since they were just clutter.

    Add "XXX OPS" to note further opportunities for improvement.

library/pprint.m:
    Conform to the changes above.

    Rename a function to avoid ambiguity.

library/pretty_printer.m:
library/stream.string_writer.m:
library/string.to_string.m:
library/term_io.m:
    Conform to the changes above.

library/string.m:
    Add a note on a significant old problem.

NEWS:
    Announce the user-visible changes.

tests/hard_coded/bug383.m:
    Update this test case to use the new system of operator priorities.

tests/hard_coded/term_io_test.{m,inp}:
    Fix white space.

extras/old_library_modules/old_mercury_term_parser.m:
extras/old_library_modules/old_ops.m:
    The old contents of the mercury_term_parser and ops modules,
    in case anyone wants to continue using them instead of updating
    their code to use their updated equivalents.

samples/calculator2.m:
    Import the old versions of mercury_term_parser and ops.
2022-11-11 00:11:44 +11:00
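The arg_gt / arg_ge rule from the commit above can be illustrated with a small sketch. It is written in Python rather than Mercury, models priorities as plain non-negative integers, and uses invented priority values; needs_parens is a hypothetical helper, and only the names arg_gt, arg_ge and min_priority_for_arg come from the commit message.

```python
from enum import Enum


class ArgPrio(Enum):
    """Whether an argument's priority must exceed, or may equal, the operator's."""
    ARG_GT = "arg_gt"
    ARG_GE = "arg_ge"


def min_priority_for_arg(op_priority: int, kind: ArgPrio) -> int:
    """Lowest priority an argument may have, given that higher numbers bind tighter."""
    if op_priority < 0:
        raise ValueError("priorities are unsigned in this sketch")
    return op_priority + 1 if kind is ArgPrio.ARG_GT else op_priority


def needs_parens(arg_priority: int, op_priority: int, kind: ArgPrio) -> bool:
    """Hypothetical helper: an argument that binds too loosely must be parenthesised."""
    return arg_priority < min_priority_for_arg(op_priority, kind)
```

For a left-associative operator with an (invented) priority of 10, the left argument would be arg_ge and the right argument arg_gt, so in "a - b - c" a left operand "a - b" needs no parentheses while a right operand "b - c" does.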
Zoltan Somogyi
5b2f6e533b Disable unknown_format_call warnings using scopes.
configure.ac:
    Require the installed compiler to support disable_warning scopes
    for unknown_format_calls.

compiler/Mercury.options:
library/Mercury.options:
    Do not disable unknown_format_call warnings in whole files.

compiler/parse_tree_out_info.m:
compiler/pd_debug.m:
library/io.m:
library/stream.string_writer.m:
library/string.m:
    Disable unknown_format_call warnings for just the format calls
    that need it.
2022-09-07 12:18:19 +10:00
Zoltan Somogyi
de7a6b8e71 Fix typo in comment. 2022-07-31 11:20:06 +10:00
Peter Wang
2ff6b119ef Fix some handling of ill-formed sequences in string module.
library/string.m:
    Fix string.all_match to fail if the string being tested contains
    any ill-formed code unit sequences.

    Fix the Mercury implementation of string.contains_char to continue
    searching for the character past any ill-formed code unit sequences.
    The foreign code implementations already did so.

    Fix unsafe_index_next_repl and unsafe_prev_index_repl in C grades.
    We indexed the C string with `ReplacedCodeUnit = Str[Index]' but
    since Mercury strings have the C type `char *', ReplacedCodeUnit
    could take on a negative value. When ReplacedCodeUnit == -1,
    the caller would assume there is a valid encoding of a code point
    beginning at Index, and therefore return `not_replaced' instead of
    `replaced_code_unit(255)'.

    Add casts to `unsigned char' in other places where we index C
    strings.

    Clarify documentation.

    Document that string.duplicate_char may throw an exception.

tests/hard_coded/string_all_match.m:
tests/hard_coded/string_all_match.exp:
tests/hard_coded/string_all_match.exp2:
    Add test for string.all_match.

tests/hard_coded/contains_char_2.m:
tests/hard_coded/contains_char_2.exp:
    Delete older test case for string.contains_char.

tests/hard_coded/string_contains_char.m:
tests/hard_coded/string_contains_char.exp:
tests/hard_coded/string_contains_char.exp2:
    Add better test for string.contains_char.

tests/hard_coded/Mmakefile:
    Enable the tests.

NEWS:
    Announce the changes.
2022-07-27 14:56:49 +10:00
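The sign problem fixed in the commit above can be reproduced outside C. The sketch below uses Python's struct module to read the same byte as a signed and as an unsigned 8-bit value; it assumes a platform where plain C char is signed, which the C standard leaves implementation-defined. Both function names are hypothetical.

```python
import struct


def code_unit_as_plain_char(data: bytes, index: int) -> int:
    """Read one byte as a signed 8-bit value, as `Str[Index]` would yield
    on a platform where `char` is signed (range -128..127)."""
    return struct.unpack_from("b", data, index)[0]


def code_unit_as_unsigned_char(data: bytes, index: int) -> int:
    """Read the same byte after the equivalent of a cast to `unsigned char`
    (range 0..255)."""
    return struct.unpack_from("B", data, index)[0]
```

For the byte 0xFF the signed read gives -1, which is the value the commit message says the caller misinterpreted, while the unsigned read gives 255, the value that belongs in replaced_code_unit.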
Julien Fischer
ae0525af53 Add string.contains_match/2.
Add a new predicate that tests whether a string contains any characters for
which a given test predicate succeeds.

library/string.m:
    Add the new predicate.

compiler/options.m:
    Replace the predicate string_contains_whitespace/1 defined here
    with a call to the new library predicate.

NEWS:
    Announce the new predicate.

tests/hard_coded/Mmakefile:
tests/hard_coded/string_contains_match.{m,exp}:
    Add a test of the new predicate.
2022-07-25 19:38:01 +10:00
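The behaviour described in the commit above amounts to an "any character satisfies the test" check. A minimal Python sketch follows; the argument order (test first, string second) is an assumption made for this illustration and is not taken from the Mercury declaration.

```python
def contains_match(pred, s: str) -> bool:
    """True if at least one character of s satisfies pred."""
    return any(pred(ch) for ch in s)


def string_contains_whitespace(s: str) -> bool:
    """The kind of one-off helper that such a library predicate replaces."""
    return contains_match(str.isspace, s)
```

An empty string contains no matching character, so the result for "" is always false.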
Julien Fischer
31d5a2bef9 Fix typos in library documentation.
library/integer.m:
library/psqueue.m:
library/string.m:
    s/This modules/This module/
2022-07-25 00:52:24 +10:00
Julien Fischer
270fde9f92 Clarify behaviour of string.duplicate_char.
library/string.m:
    Describe what is returned when duplicate_char is called with
    a negative count.
2022-06-01 23:52:40 +10:00
Zoltan Somogyi
0186a64520 Warn for unneeded use of mode-specific clauses.
compiler/add_clause.m:
    Generate a warning for mode-specific clauses when the clause is for
    a predicate that has only one mode, provided that the warning is enabled.

compiler/options.m:
    Add an option to enable this warning.

doc/user_guide.texi:
    Document this option.

library/exception.m:
library/int.m:
library/rtti_implementation.m:
library/string.m:
    Delete modes from clause heads that would get this warning.

tests/valid/spurious_purity_warning.m:
    Delete modes from clause heads that would get this warning.

    Do not interleave predicate definitions.

tests/warnings/unneeded_mode_specific_clause.{m,exp}:
    A test case for this warning.
tests/warnings/Mmakefile:
    Enable the new test case.

tests/invalid/multimode_syntax.err_exp:
    Expect the new warning.
2022-04-13 23:39:23 +10:00
Zoltan Somogyi
01da05dd9c Improve comments. 2022-04-13 18:06:20 +10:00
Julien Fischer
dcb9745d64 Delete obsolete procedures from string module.
library/string.m:
    Delete procedures that have been marked as obsolete since 2019.

NEWS:
    Announce the deletions.

tests/hard_coded/string_append_ooi.m:
tests/hard_coded/string_append_ooi_ilseq.m:
tests/hard_coded/string_presuffix.{m,exp}:
    Conform to the above changes.
2022-04-13 16:54:14 +10:00
Zoltan Somogyi
8ff61f8a4b Delete quotes from `VarNames' in stdlib comments.
In the Mercury standard library, every exported predicate or function
has (or at least *should* have) a comment that documents it, including
the meanings of its arguments. About 35-40% of these modules put `'s
(left and right quotes) around the names of the variable representing
those arguments. Some tried to do it consistently (though there were spots
with unquoted or half quoted names), while some did it only in a few places.
This is inconsistent: we should either do it everywhere, or nowhere.
This diff makes it nowhere, because

- this is what the majority of the standard library modules do;
- this is what virtually all of the modules in the compiler, profiler,
  deep_profiler etc directories do;
- typing all those quotes when adding new predicates in modules that
  follow this convention is a pain in the ass; and because
- on many modern terminals, `' looks non-symmetrical and weird.

Likewise, the comment explaining a predicate often started with

    % `predname(arguments)' returns ...

This diff deletes these quotes as well, since they add nothing useful.

This diff does leave in place quotes around code fragments, both terms
and goals, where this helps delineate the boundaries of that fragment.
2022-03-07 11:49:00 +11:00
Zoltan Somogyi
06f81f1cf0 Add end_module declarations ...
... to modules which did not yet have them.
2022-01-09 10:36:15 +11:00
Julien Fischer
cd540da1f2 Speed up uint32 to string conversion in C grades.
In C grades, uint32_to_string/1 is currently implemented by doing the
following:

 1. Calling sprintf() to write the string into a buffer on the stack.
 2. Calling strlen() on the buffer to determine the actual number of digits.
 3. Allocating the appropriate amount of space on the heap.
 4. Copying the string from the buffer on to the heap.

The benchmark included in this diff contains a number of alternative
implementations, all of which avoid much of the above overhead.
All of these implementations work by computing the required number of digits,
allocating space on the heap and then writing the digits into the string
backwards.

library/string.m:
    Replace uint32_to_string with a faster implementation.
    (This is alternative 1 in the benchmark program below.)

tests/hard_coded/Mmakefile:
tests/hard_coded/uint32_to_string.{m,exp}:
    A test of uint32 to string.

benchmarks/progs/integer_to_string/uint32_conversion.m:
    A program for benchmarking various uint32 to string conversion
    implementations.
2021-10-27 22:28:01 +11:00
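The approach common to the alternatives in the commit above (compute the number of digits, allocate exactly that much space, then write the digits backwards) can be sketched as below. This is a Python illustration of the technique, not a transcription of the C code in library/string.m; a bytearray stands in for the heap-allocated string.

```python
def uint32_to_string(value: int) -> str:
    """Convert an unsigned 32-bit integer to its decimal string."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in a uint32")

    # Step 1: count the decimal digits (zero still needs one digit).
    num_digits = 1
    rest = value // 10
    while rest > 0:
        num_digits += 1
        rest //= 10

    # Step 2: allocate exactly the space required.
    buf = bytearray(num_digits)

    # Step 3: fill the buffer from the last position to the first.
    pos = num_digits
    rest = value
    while True:
        pos -= 1
        buf[pos] = ord("0") + rest % 10
        rest //= 10
        if rest == 0:
            break

    return buf.decode("ascii")
```

Compared with the four steps listed in the commit message, this needs neither an intermediate buffer nor a separate length scan, and the only copy is the final conversion of the buffer to a string.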
Julien Fischer
2ea7e31475 Faster int to decimal string conversion.
The int_to_string conversion operation is currently implemented using
int_to_base_string/3. For the other integer types, we implement the equivalent
conversion using library code in the target languages. Benchmarking
int_to_string shows that it is *much* slower than the equivalent target
language code. When converting 10,000,000 randomly generated (64-bit) integers,
the timings in asm_fast.gc are:

    int_to_base_string: 37100 ms
       target language: 8920 ms

(Repeated six times using benchmark_det_io/7).

library/string.m:
    Implement int_to_string using foreign code.
2021-10-17 17:16:47 +11:00
Zoltan Somogyi
3d2a22382f Speed up converting strings to ints/uints.
library/string.m:
    Test whether the base is between 2 and 36 just once.

library/char.m:
    Provide a predicate to convert a character to a base digit
    that does not check whether the base is between 2 and 36,
    leaving that to the caller.

    Make the error message for bases not in that range more informative,
    in the predicates that *do* check the base.

NEWS:
    Announce unsafe_base_digit_to_int.
2021-07-06 12:04:52 +10:00
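The optimisation in the commit above is to validate the base once in the caller and use an unchecked digit conversion inside the loop. A Python sketch under stated assumptions: base_string_to_uint is a hypothetical name, and the signature given here to unsafe_base_digit_to_int is a guess made for illustration, not the Mercury declaration.

```python
def unsafe_base_digit_to_int(base: int, ch: str):
    """Digit value of ch in the given base, or None if ch is not a digit.

    "Unsafe" in the sense of the commit: the caller must already have
    checked that 2 <= base <= 36; no such check is repeated here.
    """
    if "0" <= ch <= "9":
        digit = ord(ch) - ord("0")
    elif "a" <= ch <= "z":
        digit = ord(ch) - ord("a") + 10
    elif "A" <= ch <= "Z":
        digit = ord(ch) - ord("A") + 10
    else:
        return None
    return digit if digit < base else None


def base_string_to_uint(base: int, s: str):
    """Parse s as an unsigned number in the given base; None if s is not valid."""
    if not 2 <= base <= 36:  # checked once, not once per character
        raise ValueError(f"base {base} is not in the range 2 to 36")
    if not s:
        return None
    value = 0
    for ch in s:
        digit = unsafe_base_digit_to_int(base, ch)
        if digit is None:
            return None
        value = value * base + digit
    return value
```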
Zoltan Somogyi
4926b38ea6 Add uint->uintN conversions.
library/uint16.m:
library/uint32.m:
library/uint64.m:
library/uint8.m:
    Add to each of these modules whichever subset of {from_uint, det_from_uint,
    cast_from_uint} makes sense for that module.

tests/hard_coded/from_uint_uint8.{m,exp}:
tests/hard_coded/from_uint_uint16.{m,exp}:
tests/hard_coded/from_uint_uint32.{m,exp}:
    Three new test cases to test the changes to uint{8,16,32}.m.
    The change to uint64.m will be tested later.

tests/hard_coded/Mmakefile:
    Enable the new test cases.

library/string.m:
    Add ways to convert strings to uints.

NEWS:
    Announce the new predicates and functions.
2021-06-30 23:55:56 +10:00
Zoltan Somogyi
0d7c8a7654 Specify pred or func for all pragmas.
*/*.m:
    As above.

configure.ac:
    Require the installed compiler to support this capability.
2021-06-16 15:23:58 +10:00
Zoltan Somogyi
b55f201fb2 Fix bug in shadowed Mercury code. 2021-06-02 14:29:35 +10:00
Zoltan Somogyi
8fa46514d8 Simplify reading files as lines.
library/io.m:
    Add two new predicates.

    The new predicate read_named_file_as_string reads in a named file,
    and returns its contents as a single giant string. This is effectively
    a combination of open_input, read_file_as_string, and close_input.

    The new predicate read_named_file_as_lines reads in a named file,
    and returns its contents as a list of strings, each containing
    the contents of one line (minus the newline char at the end, if any).
    This is effectively a combination of open_input, read_file_as_string,
    split_into_lines (see below), and close_input.

library/string.m:
    Add a new function, split_into_lines, which breaks up a string into
    its constituent lines, returning each line minus its newline.

    This differs from the old split_at_char('\n', ...) in that it
    expects the input string to end with a newline, and does not return
    an empty string after a final newline. (If there *are* characters
    after the final newline, it does return *them* as the final line.)

NEWS:
    Announce the new predicates and function.

compiler/source_file_map.m:
    Use the new functionality to simplify the code that reads in
    Mercury.modules files.

tests/hard_coded/string_split.{m,exp}:
    Add tests for split_into_lines.
2021-04-15 02:40:01 +10:00
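The difference between split_into_lines and a plain split at '\n', as described in the commit above, can be sketched in Python. This models only the behaviour stated in the commit message; the result for an empty input (an empty list) is an assumption of this sketch.

```python
def split_at_newline(s: str) -> list:
    """Plain split: a final newline yields a trailing empty string."""
    return s.split("\n")


def split_into_lines(s: str) -> list:
    """Split s into lines, each without its newline.

    A final newline does not produce an extra empty string, but any
    characters after the final newline are returned as the last line.
    """
    lines = s.split("\n")
    if lines[-1] == "":
        # Either s ended with a newline or s was empty (assumed: no lines).
        lines.pop()
    return lines
```

For "a\nb\n" the plain split returns ["a", "b", ""] while split_into_lines returns ["a", "b"]; for "a\nb" both return ["a", "b"].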
Zoltan Somogyi
040d6717a6 Fix comments. 2021-01-26 23:22:54 +11:00
Zoltan Somogyi
9c248726a6 Add uint{64,}_to_lc_hex_string.
library/string.m:
    We long had uint_to_hex_string and uint_to_uc_hex_string. Add
    uint_to_lc_hex_string as well, and make uint_to_hex_string call it.
    This way, users don't have to remember which of the upper and lower
    case versions is defined, and which is missing.

    Do the same for the 64 bit version.

NEWS:
    Announce the new functions.

library/string.format.m:
    Call the new functions.
2021-01-22 17:18:57 +11:00
Julien Fischer
52b31f5089 Add uint64 to string conversion for bases 8 and 16.
library/string.m:
     Add functions for converting uint64s to strings of base 8 or base 16
     digits. For most integer types we can cast to a uint and then use the
     uint versions of these operations but for 64-bit types we cannot since
     on some of our supported platforms uints are 32-bit.

NEWS:
     Announce the additions.

tests/hard_coded/Mmakefile:
tests/hard_coded/uint64_string_conv.{m,exp}:
     Add a test of the new functions.
2020-12-15 22:45:31 +11:00
Julien Fischer
f8e65add3a Format uints directly.
Currently, the Mercury implementation of string formatting handles uints by
casting them to ints and then using the code for formatting signed integers as
unsigned values.  Add an implementation that works directly on uints and make
the code that formats signed integers as unsigned integers use that instead.
The new implementation is simpler and avoids unnecessary conversions to
arbitrary precision integers.

Add new functions for converting uint values directly to octal and hexadecimal
strings that use functionality provided by the underlying platforms; replace
the Mercury code that previously did that with calls to these new functions.

library/string.m:
    Add the functions uint_to_hex_string/1, uint_to_uc_hex_string/1 and
    uint_to_octal_string/1.

library/string.format.m:
    Make format_uint/6 operate directly on uints instead of casting the value
    to a signed int and calling format_unsigned_int/6.

    Make format_unsigned_int/6 cast the int value to a uint and then call
    format_uint/6.

    Delete predicates and functions used to convert ints to octal and
    hexadecimal strings.  We now just use the functions exported by
    the string module.

NEWS:
    Announce the additions to the string module.

tests/hard_coded/Mmakefile:
tests/hard_coded/uint_string_conv.{m,exp*}:
     Add a test of uint string conversion.
2020-11-20 23:07:52 +11:00
Julien Fischer
8f35be65f5 Delete default Mercury clauses previously used for the Erlang backend.
library/string.m:
    As above.
2020-11-14 14:39:08 +11:00
Zoltan Somogyi
d4861d739d Allow formatting of sized integers.
library/string.m:
    Add {i,u}{8,16,32,64} as function symbols in the poly_type type,
    each with a single argument containing an integer with the named
    signedness and size.

    The idea is that each of these poly_type values works exactly
    the same way as the i(_) poly_type (if signed) or the u(_) poly_type
    (if unsigned), with the exception that the value specified by the call
    is cast to int or uint before being processed.

library/string.parse_runtime.m:
    Parse the new kinds of poly_types. Change the representation of the result
    of the parsing to allow recording of the sizes of ints and uints.

    Put the code that does the parsing into a predicate of its own.

library/string.format.m:
    Do a cast to int or uint if the size information recorded in the
    specification of a signed or unsigned integer value calls for it.

    Provide functions to do the casting that do not require the import
    of {int,uint}{8,16,32,64}.m. This is to allow the compiler to generate
    calls to do such casts without having to implicitly import those modules.

    Abort if a 64 bit number is being cast to a 32 bit word.

compiler/parse_string_format.m:
    Make the same changes as in string.parse_runtime.m, mutatis mutandis.

compiler/format_call.m:
    Handle the new kinds of poly_types by adding a cast to int or uint
    if necessary, using the predicates added to library/string.format.m.

    Use a convenience function to make code creating instmap deltas
    more readable.

library/io.m:
library/pprint.m:
library/string.parse_util.m:
tests/invalid/string_format_bad.m:
tests/invalid/string_format_unknown.m:
    Conform to the changes above.

tests/string_format/string_format_d.m:
tests/string_format/string_format_u.m:
    Test the printing of some of the new poly_types.

tests/string_format/string_format_d.exp2:
tests/string_format/string_format_u.exp2:
    Update the expected output of these tests on 64-bit platforms.

tests/string_format/string_format_lib.m:
    Update programming style.
2020-11-10 11:00:47 +11:00
Peter Wang
0d3fcbaae3 Delete Erlang code from library/mdbcomp/browser directories.
library/*.m:
    Delete Erlang foreign code and foreign types.

    Delete documentation specific to Erlang targets.

library/deconstruct.m:
    Add pragma no_determinism_warning to allow functor_number_cc/3
    to compile for now.

library/Mercury.options:
    Delete workaround only needed when targeting Erlang.

browser/listing.m:
mdbcomp/rtti_access.m:
    Delete Erlang foreign code and foreign types.
2020-10-28 14:10:56 +11:00
Zoltan Somogyi
a36eed702d Add add_suffix to the standard library.
compiler/write_deps_file.m:
library/string.m:
    Move a generally-useful function to the library.

NEWS:
    Announce the addition.
2020-10-19 15:52:47 +11:00
Julien Fischer
9528f326d2 Formatting of uints using string.format etc.
Extend the operations that perform formatted conversion, such as
string.format/2, to be able to handle values of type uint directly. We have
always supported formatting values of type int as unsigned values, but
currently the only way to format uint values is by explicitly casting them to
an int. This addresses Mantis issue #502.

library/string.m:
    Add a new alternative to the poly_type/0 type that wraps uint
    values.

    Update the documentation for string.format. uint values may
    now be formatted using the u, x, X, o or p conversion specifiers.

library/string.format.m:
   Add the necessary machinery for handling formatting of uint values.

library/string.parse_runtime.m:
library/string.parse_util.m:
   Handle uint poly_types.

library/io.m:
   Handle uint values in the write_many predicates.

library/pprint.m:
   Handle uint values in the poly/1 function.

compiler/format_call.m:
compiler/parse_string_format.m:
    Conform to the above changes.

compiler/options.m:
    Add a way to detect if a compiler supports this change.

NEWS:
    Announce the above changes.

tests/hard_coded/stream_format.{m,exp}:
    Extend this test to cover uints.

tests/invalid/string_format_bad.m:
tests/invalid/string_format_unknown.m:
    Conform to the above changes.

tests/string_format/Mmakefile:
tests/string_format/string_format_uint_o.{m,exp,exp2}:
tests/string_format/string_format_uint_u.{m,exp,exp2}:
tests/string_format/string_format_uint_x.{m,exp,exp2}:
   Add tests of string.format with uints.
2020-05-23 14:01:01 +10:00
Zoltan Somogyi
a6228a9e1a Fix too-long lines. 2020-04-10 03:22:40 +10:00
Zoltan Somogyi
a2bdcece54 Improve English in some comments. 2020-04-07 22:24:00 +10:00
Peter Wang
ff0c363ea4 Define int to string conversions more precisely.
library/string.m:
    As above.
2020-01-21 16:19:27 +11:00
Peter Wang
7d52b9f593 Announce recent changes to string type and string module.
NEWS:
    Announce changes regarding ill-formed code unit sequences in
    strings.

library/string.m:
    Delete a note about ongoing work.
2019-11-19 14:23:15 +11:00
Peter Wang
78da14c581 Add string indexing predicates that indicate a code unit was replaced.
library/string.m:
    Add index_next_repl, unsafe_index_next_repl, prev_index_repl,
    unsafe_prev_index_repl predicates that return an indication if a
    replacement character was returned because an ill-formed code unit
    sequence was encountered.

    Add more pragma inlines for indexing predicates.

    Remove may_not_duplicate attribute on the Erlang version of
    unsafe_prev_index_repl, which would conflict with the pragma inline
    declaration. This requires the helper function do_unsafe_prev_index
    to be exported.

tests/hard_coded/string_append_ooi_ilseq.m:
tests/hard_coded/string_set_char_ilseq.m:
    Use index_next_repl in test cases.

NEWS:
    Announce additions.
2019-11-19 14:23:15 +11:00
Peter Wang
9a042f4fb1 Minor documentation changes.
library/string.m:
    Add missing word.

    Just write "code points" instead of "character" followed by
    clarification in a few spots.

    Delete _underscores_ which aren't particularly helpful.
2019-11-14 15:45:40 +11:00
Peter Wang
7ef407e937 Enable pragma obsolete_proc declarations.
library/string.m:
    Enable pragma obsolete_proc declarations since we now require a
    recent enough compiler version.
2019-11-14 11:28:25 +11:00
Peter Wang
5c3b392ed0 Implement string.(un)capitalize_first more efficiently.
library/string.m:
    Avoid creating temporary string in capitalize_first and
    uncapitalize_first.
2019-11-12 17:16:50 +11:00
Peter Wang
f71b5f20ed Define behaviour of string.set_char etc on ill-formed sequences.
library/string.m:
    Define behaviour of set_char, det_set_char and unsafe_set_char on
    ill-formed sequences. Also define them to throw an exception on an
    attempt to set a null character or surrogate code point in a UTF-8
    string.

    Delete claim that unsafe_set_char is constant time. That would only
    be true for the destructive mode of unsafe_set_char, and that mode
    has been disabled for a long time.

    Implement the defined behaviour for C and C# versions of
    unsafe_set_char. The Java version already behaved as defined.

    Use unsafe_set_char to implement set_char instead of duplicating
    foreign code.

    Replace a couple of uses of strcpy with MR_memcpy as it was
    convenient to do so. (On OpenBSD, the linker issues a warning
    whenever strcpy is used. Avoiding the warning is not high priority
    but we might still like to eliminate all uses of strcpy eventually.)

tests/hard_coded/Mmakefile:
tests/hard_coded/string_set_char_ilseq.exp:
tests/hard_coded/string_set_char_ilseq.exp2:
tests/hard_coded/string_set_char_ilseq.m:
    Add test case.
2019-11-12 17:16:34 +11:00
Peter Wang
ae2dda693e Avoid range checks in string.split_at_separator.
library/string.m:
    Avoid unnecessary range checks in split_at_separator.
2019-11-08 14:25:23 +11:00
Peter Wang
b68548d4dc Avoid garbage in Mercury versions of string.append_list/join_list.
library/string.m:
    Use unsafe_append_string_pieces in Mercury implementations of
    append_list and join_list. This has no practical effect as we have
    foreign code implementations of both, for all target languages.
2019-11-08 14:25:23 +11:00
Peter Wang
68ae33c426 Avoid intermediate strings in string.replace_all.
library/string.m:
    Implement string.replace_all using unsafe_append_string_pieces to
    avoid intermediate strings. Use unsafe_sub_string_search_start to
    avoid repeated range checks.
2019-11-08 14:25:23 +11:00