7 Commits

Author SHA1 Message Date
Zoltan Somogyi
f9fe1d5a1c Delete see/seen/tell/told from io.m.
library/io.m:
    As above. They were already marked as obsolete.

NEWS:
    Mention the change.

tests/hard_coded/remove_file.m:
tests/hard_coded/utf8_io.m:
tests/par_conj/dep_par_24.m:
tests/par_conj/dep_par_24b.m:
tests/tabling/mercury_java_parser_dead_proc_elim_bug.m:
tests/tabling/mercury_java_parser_dead_proc_elim_bug2.m:
tests/valid/mercury_java_parser_follow_code_bug.m:
    Replace references to the deleted predicates.
2022-03-05 15:14:27 +11:00
Zoltan Somogyi
c03b11ca48 Update the style of more test cases.
And update expected outputs for changed line numbers.
2021-07-27 19:29:21 +10:00
Zoltan Somogyi
58ea6ffff2 Delete old obsolete predicates and functions.
library/*.m:
    Specifically, delete any predicates and functions whose `pragma obsolete'
    dates from 2018 or before. Keep the ones that were obsoleted
    only this year or last year.

NEWS:
    Announce the changes.

tests/debugger/io_tab_goto.m:
tests/debugger/tabled_read.m:
tests/declarative_debugger/io_stream_test.m:
tests/declarative_debugger/tabled_read_decl.m:
tests/declarative_debugger/tabled_read_decl_goto.m:
tests/general/array_test.m:
tests/hard_coded/mutable_init_impure.m:
tests/hard_coded/remove_file.m:
tests/tabling/mercury_java_parser_dead_proc_elim_bug.m:
tests/tabling/mercury_java_parser_dead_proc_elim_bug2.m:
tests/valid/mercury_java_parser_follow_code_bug.m:
    Replace references to predicates and functions that this diff deletes
    with their suggested replacements.

    In several test cases, bring the programming style up to date.

tests/hard_coded/shift_test.{m,exp}:
    Most of this test case tested the now-deleted legacy shift operations.
    Replace these with tests of their non-legacy versions, including
    testing for the expected exceptions.

tests/hard_coded/shift_test.{m,exp}:
    Don't pass --no-warn-obsolete when compiling shift_test.m anymore.
2020-08-18 11:57:47 +10:00
Peter Wang
3621cfa650 Delete deprecated substring predicates and functions.
library/string.m:
    Delete long-deprecated substring/3 function and substring/4 predicate.
    The newly introduced `string_piece' type has a substring/3 data
    constructor which takes (start, end) offsets into the base string,
    whereas the function and predicate take (start, count) arguments.
    To reduce potential confusion, delete the deprecated function and
    predicate.

    Delete other deprecated substring predicates and functions as well.
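
    To illustrate the confusion this deletion avoids, here is a minimal
    Python sketch (not Mercury code) contrasting the two argument
    conventions. The helper names `substring_by_count` and
    `substring_between` are hypothetical and exist only for this example;
    they are not the Mercury library's definitions, and Python offsets
    count code points rather than code units.

```python
def substring_by_count(s: str, start: int, count: int) -> str:
    """(start, count) convention, as taken by the deleted function and predicate."""
    return s[start:start + count]


def substring_between(s: str, start: int, end: int) -> str:
    """(start, end) convention, as taken by the string_piece data constructor."""
    return s[start:end]


base = "hello world"

# The same pair of integers selects different text under each convention.
by_count = substring_by_count(base, 6, 5)  # five characters starting at offset 6
between = substring_between(base, 6, 5)    # end precedes start, so nothing is selected
```

    A call written for one convention but interpreted under the other
    would silently return the wrong text, which is the kind of mix-up
    the commit message describes.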

tests/general/Mercury.options:
tests/general/string_foldl_substring.exp:
tests/general/string_foldl_substring.m:
tests/general/string_foldr_substring.exp:
tests/general/string_foldr_substring.m:
tests/hard_coded/Mercury.options:
tests/hard_coded/string_substring.m:
    Delete tests for deprecated predicates.

tests/tabling/mercury_java_parser_dead_proc_elim_bug.m:
tests/tabling/mercury_java_parser_dead_proc_elim_bug2.m:
tests/valid/mercury_java_parser_follow_code_bug.m:
    Replace calls to unsafe_substring with unsafe_between.

NEWS:
    Announce the changes.
2019-11-08 14:25:23 +11:00
Zoltan Somogyi
fdd141bf77 Clean up the tests in the other test directories.
tests/invalid/*.{m,err_exp}:
tests/misc_tests/*.m:
tests/mmc_make/*.m:
tests/par_conj/*.m:
tests/purity/*.m:
tests/stm/*.m:
tests/string_format/*.m:
tests/structure_reuse/*.m:
tests/submodules/*.m:
tests/tabling/*.m:
tests/term/*.m:
tests/trailing/*.m:
tests/typeclasses/*.m:
tests/valid/*.m:
tests/warnings/*.{m,exp}:
    Make these tests use four-space indentation, and ensure that
    each module is imported on its own line. (I intend to use the latter
    to figure out which subdirectories' tests can be executed in parallel.)

    These changes usually move code to different lines. For the tests
    that check compiler error messages, expect the new line numbers.

browser/cterm.m:
browser/tree234_cc.m:
    Import only one module per line.

tests/hard_coded/boyer.m:
    Fix something I missed.
2015-02-16 12:32:18 +11:00
Peter Wang
3788a9d6fb Improve Unicode support.
Branches: main

Declare that we use the Unicode character set, and UTF-8 or UTF-16 for the
internal string representation (depending on the backend).  User code may be
written to those assumptions.  Other external encodings can be supported in
the future by translating to/from Unicode internally.

The `char' type now represents a Unicode code point.

NOTE: questions about how to handle unpaired surrogate code points, etc. have
been left for later.


library/char.m:
        Define a `char' to be a Unicode code point and extend ranges
        appropriately.

        Add predicates: to_utf8, to_utf16, is_surrogate, is_noncharacter.

	Update some documentation.
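
        As a rough illustration of what predicates with these names would
        compute, here is a Python sketch based on the Unicode definitions
        of surrogates and noncharacters. It is an assumption-laden
        stand-in, not the Mercury implementation: the function signatures
        are hypothetical, and how the real predicates treat surrogate
        code points is not stated in this log.

```python
from typing import List


def is_surrogate(cp: int) -> bool:
    """True for code points in the UTF-16 surrogate range U+D800..U+DFFF."""
    return 0xD800 <= cp <= 0xDFFF


def is_noncharacter(cp: int) -> bool:
    """True for U+FDD0..U+FDEF and for the last two code points of each plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def to_utf8(cp: int) -> List[int]:
    """UTF-8 code units (bytes) for one code point.

    Python raises UnicodeEncodeError for surrogates here.
    """
    return list(chr(cp).encode("utf-8"))


def to_utf16(cp: int) -> List[int]:
    """UTF-16 code units for one code point (one unit, or a surrogate pair)."""
    data = chr(cp).encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
```

        For example, U+20AC (the euro sign) needs three UTF-8 code units
        but one UTF-16 code unit, while a code point above U+FFFF needs
        four UTF-8 code units and a UTF-16 surrogate pair.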

library/io.m:
	Declare I/O predicates on text streams to read/write code points, not
	ambiguous "characters".  Text files are expected to use UTF-8 encoding.
	Supporting other encodings is for future work.

        Update the C and Erlang implementations to understand UTF-8 encoding.

	Update Java and C# implementations to read/write code points (Mercury
	char) instead of UTF-16 code units.

	Add `may_not_duplicate' attributes to some foreign_procs.

	Improve Erlang implementations of seeking and getting the stream size.

library/string.m:
	Declare the string representations, as described earlier.

        Distinguish between code units and code points everywhere.
	Existing functions and predicates which take offset and length
	arguments continue to take them in terms of code units.
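
        The distinction can be sketched in Python as follows. The helper
        names are hypothetical, loosely modelled on the procedure names
        listed in this entry, and are not the Mercury signatures; the
        sketch only shows why a code unit count and a code point count
        can differ for the same string.

```python
def count_code_points(s: str) -> int:
    """Number of Unicode code points in the string."""
    return len(s)


def count_utf8_code_units(s: str) -> int:
    """Number of 8-bit code units in the UTF-8 encoding."""
    return len(s.encode("utf-8"))


def count_utf16_code_units(s: str) -> int:
    """Number of 16-bit code units in the UTF-16 encoding."""
    return len(s.encode("utf-16-le")) // 2


def code_point_offset_utf8(s: str, n: int) -> int:
    """UTF-8 code unit offset at which the n-th code point (0-based) starts."""
    return len(s[:n].encode("utf-8"))


# 'a', e-acute, euro sign, and one code point outside the Basic Multilingual Plane.
text = "a\u00e9\u20ac\U0001F600"
```

        An offset or length expressed in code units therefore depends on
        the string representation, which is why the entry states
        explicitly which unit the existing arguments use.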

        Add procedures: count_code_units, count_codepoints, codepoint_offset,
	to_code_unit_list, from_code_unit_list, index_next, unsafe_index_next,
	unsafe_prev_index, unsafe_index_code_unit, split_by_codepoint,
	left_by_codepoint, right_by_codepoint, substring_by_codepoint.

	Make index, index_det call error/1 if an illegal sequence is detected,
	as they already do for invalid offsets.

	Clarify that is_all_alpha, is_all_alnum_or_underscore,
	is_alnum_or_underscore only succeed for the ASCII characters under each
	of those categories.

        Clarify that whitespace stripping functions only strip whitespace
        characters in the ASCII range.

	Add comments about the future treatment of surrogate code points
	(not yet implemented).

	Use Mercury format implementation when necessary instead of `sprintf'.
	The %c specifier does not work for code points which require multi-byte
	representation.  The field width modifier for %s only works if the
	string contains only single-byte code points.
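
        The field width problem can be shown with a small Python sketch.
        It assumes a formatter that, like C's `%s', measures the field
        width in bytes of the UTF-8 encoding; both helper names are
        hypothetical and neither is the Mercury implementation.

```python
def pad_left_by_bytes(s: str, width: int) -> str:
    """Pad the way a byte-counting formatter would: measure the UTF-8 length."""
    n = len(s.encode("utf-8"))
    return " " * max(0, width - n) + s


def pad_left_by_code_points(s: str, width: int) -> str:
    """Pad by counting code points instead of bytes."""
    return " " * max(0, width - len(s)) + s


word = "caf\u00e9"  # four code points, five UTF-8 bytes

short = pad_left_by_bytes(word, 6)        # only one space of padding is added
full = pad_left_by_code_points(word, 6)   # two spaces, giving six code points
```

        For ASCII-only strings the two helpers agree, which matches the
        note that the width modifier only works when every code point is
        a single byte.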

library/lexer.m:
        Conform to string encoding changes.

        Simplify code dealing with \uNNNN escapes now that encoding/decoding
        is handled by the string module.

library/term_io.m:
        Allow code points above 126 directly in Mercury source.

        NOTE: \x and \o codes are treated as code points by this change.

runtime/mercury_types.h:
        Redefine `MR_Char' to be `int' to hold a Unicode code point.

	`MR_String' has to be defined as a pointer to `char' instead of a
	pointer to `MR_Char'.  Some C foreign code will be affected by this
	change.

runtime/mercury_string.c:
runtime/mercury_string.h:
        Add UTF-8 helper routines and macros.

        Make hash routines conform to type changes.

compiler/c_util.m:
        Fix output_quoted_string_lang so that it correctly outputs non-ASCII
        characters for each of the target languages.

        Fix quote_char for non-ASCII characters.

compiler/elds_to_erlang.m:
        Write out code points above 126 normally instead of using escape
        syntax.

        Conform to string encoding changes.

compiler/mlds_to_cs.m:
        Change Mercury `char' to be represented by C# `int'.

compiler/mlds_to_java.m:
        Change Mercury `char' to be represented by Java `int'.

doc/reference_manual.texi:
        Uncomment description of \u and \U escapes in string literals.

        Update description of C# and Java representations for Mercury `char'
	which are now `int'.

tests/debugger/tailrec1.m:
        Conform to renaming.

tests/general/string_replace.exp:
tests/general/string_replace.m:
        Test string.replace with non-ASCII characters.

tests/general/string_test.exp:
tests/general/string_test.m:
        Test string.duplicate_char, string.pad_right, string.pad_left and
        string.format_table with non-ASCII characters.

tests/hard_coded/char_unicode.exp:
tests/hard_coded/char_unicode.m:
	Add test for new procedures in `char' module.

tests/hard_coded/contains_char_2.m:
        Test string.contains_char with non-ASCII characters.

tests/hard_coded/nonascii.exp:
tests/hard_coded/nonascii.m:
tests/hard_coded/nonascii_gen.c:
        Add code points above 255 to this test case.

	Change test data encoding to UTF-8.

tests/hard_coded/string_class.exp:
tests/hard_coded/string_class.m:
	Add test case for string.is_alpha, etc.

tests/hard_coded/string_codepoint.exp:
tests/hard_coded/string_codepoint.exp2:
tests/hard_coded/string_codepoint.m:
	Add test case for new string procedures dealing with code points.

tests/hard_coded/string_first_char.exp:
tests/hard_coded/string_first_char.m:
	Add test case for all modes of string.first_char.

tests/hard_coded/string_hash.m:
	Don't use buggy random.random/5 predicate which can overflow on
	a large range (such as the range of code points).

tests/hard_coded/string_presuffix.exp:
tests/hard_coded/string_presuffix.m:
	Add test case for string.prefix, string.suffix, etc.

tests/hard_coded/string_set_char.m:
        Test string.set_char with non-ASCII characters.

tests/hard_coded/string_strip.exp:
tests/hard_coded/string_strip.m:
        Test the string stripping procedures with non-ASCII characters.

tests/hard_coded/string_sub_string_search.m:
        Test string.sub_string_search with non-ASCII characters.

tests/hard_coded/unicode_test.exp:
        Update expected output due to change of behaviour of
        `string.to_char_list'.

tests/hard_coded/unicode_test.m:
        Test string.join_list with a non-ASCII character in the separator
        string argument.

tests/hard_coded/utf8_io.exp:
tests/hard_coded/utf8_io.m:
	Add tests for UTF-8 I/O.

tests/hard_coded/words_separator.exp:
tests/hard_coded/words_separator.m:
        Add test case for `string.words_separator'.

tests/hard_coded/Mmakefile:
	Add new test cases.

	Make special_char test case run on all backends.

tests/hard_coded/special_char.exp:
tests/valid/mercury_java_parser_follow_code_bug.m:
	Reencode these files in UTF-8.

NEWS:
	Add a news entry.
2011-04-04 07:10:42 +00:00
Zoltan Somogyi
fc36858e4f Fix a bug that prevented some of the Mercury programs used for the packrat
Estimated hours taken: 0.2
Branches: main

Fix a bug that prevented some of the Mercury programs used for the packrat
paper from compiling. This is the bug description from the new test case:

% This is a regression test. In versions of the compiler before 7 Aug 2007,
% it used to cause this compiler abort:
%
% Software Error: code_gen.m: Unexpected: nondet model in det/semidet context
%
% when generating code for the java_identifier predicate.
%
% The bug was that the follow_code pass would push a goal binding a variable
% into a semidet disjunction. This would cause the disjunction to have an
% output, which would cause the simplify pass to change its determinism,
% and from there the determinism of the entire procedure body, to nondet.
% Since java_identifier is supposed to be semidet, this causes the abort above.

compiler/follow_code.m:
	Fix the bug.

tests/valid/mercury_java_parser_follow_code_bug.m:
	Add the new test case.

tests/valid/Mmakefile:
	Enable the new test case.
2007-08-07 10:03:51 +00:00