Commit Graph

5 Commits

Author SHA1 Message Date
Peter Wang
aaa6ac5fe1 Introduce io.system_error to io.m public interface.
Implement the error handling proposals from February 2022 on the
mercury-users list, and August 2022 on the mercury-reviews list.

We add io.system_error to the public interface of io.m
and document what its foreign representation is for each backend.

We allow io.error to optionally contain an io.system_error value,
and provide predicates to retrieve the io.system_error from an io.error.
The user may then inspect the system error via foreign code.

We also provide a predicate that takes an io.error and returns a name
for the system error it contains (if any). This makes it relatively easy
for Mercury programs to check for specific error conditions.

By returning platform-specific (actually, implementation-dependent)
error names, we are pushing the responsibility of mapping strings to
error conditions onto the application programmer. On the other hand, it
is not practical for us to map all possible system-specific error codes
to some common set of values. We could do it for a small set of common
error codes/exceptions, perhaps.

The standard library will construct io.error values containing
io.system_errors. However, we do not yet provide a facility for user
code to do the same.

library/io.m:
    Move io.system_error to the public interface.

    Change the internal representation of io.error to support containing
    a io.system_error. An io.system_error may originate from an errno
    value or a Windows system error code; the constructor distinguishes
    those cases.

    Add predicates to retrieve a system_error from io.error.

    Add predicate to return the name of the system error in an io.error.

    Replace make_err_msg with make_io_error_from_system_error.

    Replace make_maybe_win32_err_msg with
    make_io_error_from_maybe_win32_error.

    Delete ML_make_err_msg and ML_make_win32_err_msg macros.

browser/listing.m:
library/bitmap.m:
library/dir.m:
library/io.call_system.m:
library/io.environment.m:
library/io.file.m:
library/io.text_read.m:
mdbcomp/program_representation.m:
    Conform to changes.

    Leave comments for followup work.

tools/generate_errno_name:
tools/generate_windows_error_name:
    Add scripts to generate mercury_errno_name.c and
    mercury_windows_error_name.c.

runtime/Mmakefile:
runtime/mercury_errno_name.c:
runtime/mercury_errno_name.h:
runtime/mercury_windows_error_name.c:
runtime/mercury_windows_error_name.h:
    Add MR_errno_name() and MR_win32_error_name() functions,
    used by io.m to convert error codes to string names.

tests/hard_coded/null_char.exp:
    Update expected output.
2022-08-23 16:39:48 +10:00
Peter Wang
025bee0549 Check for surrogates when converting list of char to string.
library/string.m:
    Make from_char_list, from_rev_char_list, to_char_list throw an
    exception if the list of chars includes a surrogate code point that
    cannot be encoded in a UTF-8 string.

    Make semidet_from_char_list, semidet_from_rev_char_list,
    to_char_list fail if the list of chars includes a surrogate code
    point that cannot be encoded in a UTF-8 string.

runtime/mercury_string.h:
    Document return value of MR_utf8_width.

tests/hard_coded/Mmakefile:
tests/hard_coded/string_from_char_list_ilseq.exp:
tests/hard_coded/string_from_char_list_ilseq.exp2:
tests/hard_coded/string_from_char_list_ilseq.m:
    Add test case.

tests/hard_coded/null_char.exp:
    Expect new message in exceptions thrown by from_char_list,
    from_rev_char_list.

tests/hard_coded/string_hash.m:
    Don't generate surrogate code points in random strings.
2019-10-30 11:21:02 +11:00
Zoltan Somogyi
7d486b978a Update a test case after Paul's change to string.m. 2016-04-10 06:17:07 +10:00
Simon Taylor
5647714667 Make all functions which create strings from characters throw an exception
Estimated hours taken: 15
Branches: main

Make all functions which create strings from characters throw an exception
or fail if the list of characters contains a null character.

This removes a potential source of security vulnerabilities where one
part of the program performs checks against the whole of a string passed
in by an attacker (processing the string as a list of characters or using
`unsafe_index' to look past the null character), but then passes the string
to another part of the program or an operating system call that only sees
up to the first null character.  Even if Mercury stored the length with
the string, allowing the creation of strings containing nulls would be a
bad idea because it would be too easy to pass a string to foreign code
without checking.

For examples see:
<http://insecure.org/news/P55-07.txt>
<http://www.securiteam.com/securitynews/5WP0B1FKKQ.html>
<http://www.securityfocus.com/archive/1/445788>
<http://www.securityfocus.com/archive/82/368750>
<http://secunia.com/advisories/16420/>

NEWS:
	Document the change.

library/string.m:
	Throw an exception if null characters are found in
	string.from_char_list and string.from_rev_char_list.

	Add string.from_char_list_semidet and string.from_rev_char_list_semidet
	which fail rather throwing an exception.  This doesn't match the
	normal naming convention, but string.from_{,rev_}char_list are widely
	used, so changing their determinism would be a bit too disruptive.

	Don't allocate an unnecessary extra word for each string created by
	from_char_list and from_rev_char_list.

	Explain that to_upper and to_lower only work on un-accented
	Latin letters.

library/lexer.m:
	Check for invalid characters when reading Mercury strings and
	quoted names.

	Improve error messages by skipping to the end of any string
	or quoted name containing an error.  Previously we just stopped
	processing at the error leaving an unmatched quote.

library/io.m:
	Make io.read_line_as_string and io.read_file_as_string return
	an error code if the input file contains a null character.

	Fix an XXX: '\0\' is not recognised as a character constant,
	but char.det_from_int can be used to make a null character.

library/char.m:
	Explain the workaround for '\0\' not being accepted as a char
	constant.

	Explain that to_upper and to_lower only work on un-accented
	Latin letters.

compiler/layout.m:
compiler/layout_out.m:
compiler/c_util.m:
compiler/stack_layout.m:
compiler/llds.m:
compiler/mlds.m:
compiler/ll_backend.*.m:
compiler/ml_backend.*.m:
	Don't pass around strings containing null characters (the string
	tables for the debugger).  This doesn't cause any problems now,
	but won't work with the accurate garbage collector.  Use lists
	of strings instead, and add the null characters when writing the
	strings out.

tests/hard_coded/null_char.{m,exp}:
	Change an existing test case to test that creation of a string
	containing a null throws an exception.

tests/hard_coded/null_char.exp2:
	Deleted because alternative output is no longer needed.

tests/invalid/Mmakefile:
tests/invalid/null_char.m:
tests/invalid/null_char.err_exp:
	Test error messages for construction of strings containing null
	characters by the lexer.

tests/invalid/unicode{1,2}.err_exp:
	Update the expected output after the change to the handling of
	invalid quoted names and strings.
2007-03-18 23:35:04 +00:00
Fergus Henderson
8c6f60daba Fix a bug where we were generating C code that contained special
Estimated hours taken: 2
Branches: main

Fix a bug where we were generating C code that contained special
characters in string literals.  This generated code was relying on
the implementation-specific behaviour of GCC, and unfortunately
that behaviour changed in GCC versions 2.96 and later.
The symptom was that printing "\r\n" came out as "\n\n"
when using GCC versions >= 2.96.

compiler/c_util.m:
	Change the code used to implement quote_char, quote_string,
	and quote_multi_string so that these routines properly escape
	all special characters, rather than just \" \' \n \b and \t.
	(This required changing the output argument type for quote_char
	from a character to a string.)
	Add output_quoted_char, for use by layout_out.m.

compiler/layout_out.m:
	Use c_util__output_quoted_char, rather than duplicating the
	logic in c_util.m.

tests/hard_coded/Mmakefile:
tests/hard_coded/special_char.m:
tests/hard_coded/special_char.exp:
	Regression test.

tests/hard_coded/Mmakefile:
tests/hard_coded/null_char.m:
tests/hard_coded/null_char.exp:
tests/hard_coded/null_char.exp2:
	Add a test of outputting strings containing null characters.
	Note that currently we don't handle this correctly;
	we ignore everything after the first null character.
	So the ".exp2" file for this test case allows that output.
	If/when this is fixed, the ".exp2" file for this
	test case should be removed.
2002-08-06 00:30:52 +00:00