Implement the error handling proposals from February 2022 on the
mercury-users list, and August 2022 on the mercury-reviews list.
We add io.system_error to the public interface of io.m
and document what its foreign representation is for each backend.
We allow io.error to optionally contain an io.system_error value,
and provide predicates to retrieve the io.system_error from an io.error.
The user may then inspect the system error via foreign code.
We also provide a predicate that takes an io.error and returns a name
for the system error it contains (if any). This makes it relatively easy
for Mercury programs to check for specific error conditions.
By returning platform-specific (actually, implementation-dependent)
error names, we are pushing the responsibility of mapping strings to
error conditions onto the application programmer. On the other hand, it
is not practical for us to map all possible system-specific error codes
to some common set of values. We could do it for a small set of common
error codes/exceptions, perhaps.
The standard library will construct io.error values containing
io.system_errors. However, we do not yet provide a facility for user
code to do the same.
library/io.m:
Move io.system_error to the public interface.
Change the internal representation of io.error to support containing
an io.system_error. An io.system_error may originate from an errno
value or a Windows system error code; the constructor distinguishes
those cases.
Add predicates to retrieve a system_error from io.error.
Add a predicate to return the name of the system error in an io.error.
Replace make_err_msg with make_io_error_from_system_error.
Replace make_maybe_win32_err_msg with
make_io_error_from_maybe_win32_error.
Delete ML_make_err_msg and ML_make_win32_err_msg macros.
browser/listing.m:
library/bitmap.m:
library/dir.m:
library/io.call_system.m:
library/io.environment.m:
library/io.file.m:
library/io.text_read.m:
mdbcomp/program_representation.m:
Conform to changes.
Leave comments for follow-up work.
tools/generate_errno_name:
tools/generate_windows_error_name:
Add scripts to generate mercury_errno_name.c and
mercury_windows_error_name.c.
runtime/Mmakefile:
runtime/mercury_errno_name.c:
runtime/mercury_errno_name.h:
runtime/mercury_windows_error_name.c:
runtime/mercury_windows_error_name.h:
Add MR_errno_name() and MR_win32_error_name() functions,
used by io.m to convert error codes to string names.
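As a rough illustration of the code-to-name mapping (not the generated
runtime code), a lookup of this kind could look like the sketch below.
The name sketch_errno_name, the handful of errno values listed, and the
NULL result for unknown values are all assumptions made for the example.

```c
#include <errno.h>
#include <stddef.h>

/*
** Hypothetical stand-in for an errno-to-name lookup: map an errno value
** to its symbolic name, or return NULL if the value is not listed.
** Only four values are covered here; a generated table would cover
** whatever the platform defines.
*/
static const char *
sketch_errno_name(int errnum)
{
    switch (errnum) {
        case ENOENT:    return "ENOENT";
        case EACCES:    return "EACCES";
        case EEXIST:    return "EEXIST";
        case ENOMEM:    return "ENOMEM";
        default:        return NULL;
    }
}
```

A caller can then compare the returned name against a string such as
"ENOENT" instead of depending on the numeric value, which differs
between platforms.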
tests/hard_coded/null_char.exp:
Update expected output.
library/string.m:
Make from_char_list, from_rev_char_list, to_char_list throw an
exception if the list of chars includes a surrogate code point that
cannot be encoded in a UTF-8 string.
Make semidet_from_char_list, semidet_from_rev_char_list,
to_char_list fail if the list of chars includes a surrogate code
point that cannot be encoded in a UTF-8 string.
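For reference, a minimal sketch in C of why surrogates need rejecting:
code points U+D800 to U+DFFF have no valid UTF-8 encoding, so an encoder
has to report failure for them. The function names below are
hypothetical and are not the runtime's own.

```c
#include <stddef.h>

/* Surrogates occupy U+D800..U+DFFF and are not Unicode scalar values. */
static int
sketch_is_surrogate(unsigned long c)
{
    return c >= 0xD800UL && c <= 0xDFFFUL;
}

/*
** Encode code point c as UTF-8 into buf, which must have room for
** 4 bytes. Returns the number of bytes written, or 0 if c cannot be
** encoded (a surrogate, or a value above U+10FFFF).
*/
static size_t
sketch_utf8_encode(unsigned long c, unsigned char *buf)
{
    if (sketch_is_surrogate(c) || c > 0x10FFFFUL) {
        return 0;
    }
    if (c < 0x80UL) {
        buf[0] = (unsigned char) c;
        return 1;
    }
    if (c < 0x800UL) {
        buf[0] = (unsigned char) (0xC0 | (c >> 6));
        buf[1] = (unsigned char) (0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000UL) {
        buf[0] = (unsigned char) (0xE0 | (c >> 12));
        buf[1] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
        buf[2] = (unsigned char) (0x80 | (c & 0x3F));
        return 3;
    }
    buf[0] = (unsigned char) (0xF0 | (c >> 18));
    buf[1] = (unsigned char) (0x80 | ((c >> 12) & 0x3F));
    buf[2] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
    buf[3] = (unsigned char) (0x80 | (c & 0x3F));
    return 4;
}
```

A zero result corresponds to the case where the det predicates throw
an exception and the semidet ones fail.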
runtime/mercury_string.h:
Document return value of MR_utf8_width.
tests/hard_coded/Mmakefile:
tests/hard_coded/string_from_char_list_ilseq.exp:
tests/hard_coded/string_from_char_list_ilseq.exp2:
tests/hard_coded/string_from_char_list_ilseq.m:
Add test case.
tests/hard_coded/null_char.exp:
Expect new message in exceptions thrown by from_char_list,
from_rev_char_list.
tests/hard_coded/string_hash.m:
Don't generate surrogate code points in random strings.
Estimated hours taken: 15
Branches: main
Make all functions which create strings from characters throw an exception
or fail if the list of characters contains a null character.
This removes a potential source of security vulnerabilities where one
part of the program performs checks against the whole of a string passed
in by an attacker (processing the string as a list of characters or using
`unsafe_index' to look past the null character), but then passes the string
to another part of the program or an operating system call that only sees
up to the first null character. Even if Mercury stored the length with
the string, allowing the creation of strings containing nulls would be a
bad idea because it would be too easy to pass a string to foreign code
without checking.
For examples see:
<http://insecure.org/news/P55-07.txt>
<http://www.securiteam.com/securitynews/5WP0B1FKKQ.html>
<http://www.securityfocus.com/archive/1/445788>
<http://www.securityfocus.com/archive/82/368750>
<http://secunia.com/advisories/16420/>
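The mismatch described above can be sketched in C as follows. Both
function names and the ".txt" check are hypothetical, chosen only to
show a length-aware check and a null-terminated consumer disagreeing
about the same bytes.

```c
#include <stddef.h>
#include <string.h>

/*
** Return 1 if the len-byte buffer contains a null byte, i.e. if code
** using strlen() would see a shorter string than code that knows the
** real length.
*/
static int
sketch_has_embedded_null(const char *buf, size_t len)
{
    return memchr(buf, '\0', len) != NULL;
}

/*
** Hypothetical validity check that looks at all len bytes:
** accept only names whose last four bytes are ".txt".
*/
static int
sketch_ends_with_txt(const char *buf, size_t len)
{
    return len >= 4 && memcmp(buf + len - 4, ".txt", 4) == 0;
}
```

For the 11 bytes "passwd\0.txt", sketch_ends_with_txt succeeds, yet any
C function that stops at the first null sees only "passwd". Refusing to
create such strings in the first place removes the disagreement.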
NEWS:
Document the change.
library/string.m:
Throw an exception if null characters are found in
string.from_char_list and string.from_rev_char_list.
Add string.from_char_list_semidet and string.from_rev_char_list_semidet
which fail rather than throwing an exception. This doesn't match the
normal naming convention, but string.from_{,rev_}char_list are widely
used, so changing their determinism would be a bit too disruptive.
Don't allocate an unnecessary extra word for each string created by
from_char_list and from_rev_char_list.
Explain that to_upper and to_lower only work on un-accented
Latin letters.
library/lexer.m:
Check for invalid characters when reading Mercury strings and
quoted names.
Improve error messages by skipping to the end of any string
or quoted name containing an error. Previously we just stopped
processing at the error leaving an unmatched quote.
library/io.m:
Make io.read_line_as_string and io.read_file_as_string return
an error code if the input file contains a null character.
Fix an XXX: '\0\' is not recognised as a character constant,
but char.det_from_int can be used to make a null character.
library/char.m:
Explain the workaround for '\0\' not being accepted as a char
constant.
Explain that to_upper and to_lower only work on un-accented
Latin letters.
compiler/layout.m:
compiler/layout_out.m:
compiler/c_util.m:
compiler/stack_layout.m:
compiler/llds.m:
compiler/mlds.m:
compiler/ll_backend.*.m:
compiler/ml_backend.*.m:
Don't pass around strings containing null characters (the string
tables for the debugger). This doesn't cause any problems now,
but won't work with the accurate garbage collector. Use lists
of strings instead, and add the null characters when writing the
strings out.
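The idea of keeping the table as separate strings and adding the null
characters only on output can be sketched in C like this. The compiler
code itself is Mercury; sketch_write_string_table is a hypothetical
name used for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
** Copy the n strings in strs into out, each followed by a null byte,
** and return the total number of bytes written. The caller must make
** out large enough to hold every string plus one byte per string.
*/
static size_t
sketch_write_string_table(const char *const *strs, size_t n, char *out)
{
    size_t offset = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        size_t len = strlen(strs[i]);
        /* Copy len + 1 bytes so the terminating null is included. */
        memcpy(out + offset, strs[i], len + 1);
        offset += len + 1;
    }
    return offset;
}
```

No single in-memory string ever contains an embedded null; the nulls
exist only in the output.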
tests/hard_coded/null_char.{m,exp}:
Change an existing test case to test that creation of a string
containing a null throws an exception.
tests/hard_coded/null_char.exp2:
Delete this file, since the alternative output is no longer needed.
tests/invalid/Mmakefile:
tests/invalid/null_char.m:
tests/invalid/null_char.err_exp:
Test error messages for construction of strings containing null
characters by the lexer.
tests/invalid/unicode{1,2}.err_exp:
Update the expected output after the change to the handling of
invalid quoted names and strings.
Estimated hours taken: 2
Branches: main
Fix a bug where we were generating C code that contained special
characters in string literals. This generated code was relying on
the implementation-specific behaviour of GCC, and unfortunately
that behaviour changed in GCC versions 2.96 and later.
The symptom was that printing "\r\n" came out as "\n\n"
when using GCC versions >= 2.96.
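A minimal sketch in C of escaping every special character explicitly,
so that the generated literal never contains a raw control character.
The compiler's own routines are written in Mercury; sketch_quote_char
is a hypothetical name, and the use of three-digit octal escapes for
other non-printable characters is an assumption of this example.

```c
#include <ctype.h>
#include <stdio.h>

/*
** Write the C escape sequence for character c into out, which must
** have room for at least 5 bytes. Returns the number of characters
** written, not counting the terminating null.
*/
static int
sketch_quote_char(unsigned char c, char *out)
{
    switch (c) {
        case '"':   return sprintf(out, "\\\"");
        case '\'':  return sprintf(out, "\\'");
        case '\\':  return sprintf(out, "\\\\");
        case '\n':  return sprintf(out, "\\n");
        case '\r':  return sprintf(out, "\\r");
        case '\t':  return sprintf(out, "\\t");
        case '\b':  return sprintf(out, "\\b");
        default:
            if (isprint(c)) {
                return sprintf(out, "%c", c);
            }
            /* Three octal digits, so a following digit is not absorbed. */
            return sprintf(out, "\\%03o", (unsigned) c);
    }
}
```

The result is a string rather than a single character because an
escape sequence takes more than one character to write.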
compiler/c_util.m:
Change the code used to implement quote_char, quote_string,
and quote_multi_string so that these routines properly escape
all special characters, rather than just \" \' \n \b and \t.
(This required changing the output argument type for quote_char
from a character to a string.)
Add output_quoted_char, for use by layout_out.m.
compiler/layout_out.m:
Use c_util__output_quoted_char, rather than duplicating the
logic in c_util.m.
tests/hard_coded/Mmakefile:
tests/hard_coded/special_char.m:
tests/hard_coded/special_char.exp:
Regression test.
tests/hard_coded/Mmakefile:
tests/hard_coded/null_char.m:
tests/hard_coded/null_char.exp:
tests/hard_coded/null_char.exp2:
Add a test of outputting strings containing null characters.
Note that currently we don't handle this correctly;
we ignore everything after the first null character.
So the ".exp2" file for this test case allows that output.
If/when this is fixed, the ".exp2" file for this
test case should be removed.