mercury/tests/hard_coded/string_code_unit.exp
Peter Wang 4d09b5fb8f Conversion of strings to lists of UTF-8/UTF-16 code units.
Add the predicates:

	string.to_utf8_code_unit_list/2
	string.to_utf16_code_unit_list/2
	string.from_utf8_code_unit_list/2
	string.from_utf16_code_unit_list/2

library/string.m:
	Implement the above predicates in Mercury code that works across
	backends.

	Add internal_encoding_is_utf8 helper predicate.

	Forward the new predicates to string.to_code_unit_list and
	string.from_code_unit_list when the named encoding matches that
	of the backend in use.

tests/hard_coded/Mmakefile
tests/hard_coded/string_code_unit.exp
tests/hard_coded/string_code_unit.m
	Add test case.

NEWS:
	Announce the additions.
2015-02-27 12:04:15 +11:00


code points: 0x01, 0x7f
UTF-8: 0x01, 0x7f
UTF-16: 0x01, 0x7f
code points: 0x80, 0x7ff
UTF-8: 0xc2, 0x80, 0xdf, 0xbf
UTF-16: 0x80, 0x7ff
code points: 0x800, 0xffff
UTF-8: 0xe0, 0xa0, 0x80, 0xef, 0xbf, 0xbf
UTF-16: 0x800, 0xffff
code points: 0x100000, 0x10ffff
UTF-8: 0xf4, 0x80, 0x80, 0x80, 0xf4, 0x8f, 0xbf, 0xbf
UTF-16: 0xdbc0, 0xdc00, 0xdbff, 0xdfff
code points: 0x01, 0xd7ff
UTF-8: 0x01, 0xed, 0x9f, 0xbf
UTF-16: 0x01, 0xd7ff
code points: 0xe000, 0xffff
UTF-8: 0xee, 0x80, 0x80, 0xef, 0xbf, 0xbf
UTF-16: 0xe000, 0xffff
code points: 0x10000, 0x10ffff
UTF-8: 0xf0, 0x90, 0x80, 0x80, 0xf4, 0x8f, 0xbf, 0xbf
UTF-16: 0xd800, 0xdc00, 0xdbff, 0xdfff
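
The expected values above can be cross-checked independently of the Mercury library. The following is a minimal sketch in Python, not the Mercury implementation: the helper names to_utf8_units, to_utf16_units and show are hypothetical, and it assumes the inputs are valid Unicode scalar values (no lone surrogates), as in every case listed here. UTF-8 goes through Python's built-in codec, while UTF-16 is computed by hand to make the surrogate-pair arithmetic visible.

```python
def to_utf8_units(code_points):
    """Return the UTF-8 code units (bytes) for a list of code points."""
    text = "".join(chr(cp) for cp in code_points)
    return list(text.encode("utf-8"))


def to_utf16_units(code_points):
    """Return the UTF-16 code units (16-bit values) for a list of code points."""
    units = []
    for cp in code_points:
        if cp < 0x10000:
            # Basic Multilingual Plane: one code unit, equal to the code point.
            units.append(cp)
        else:
            # Supplementary plane: split the 20-bit offset into a surrogate pair.
            offset = cp - 0x10000
            units.append(0xD800 + (offset >> 10))    # high (lead) surrogate
            units.append(0xDC00 + (offset & 0x3FF))  # low (trail) surrogate
    return units


def show(units):
    """Format code units in the same style as the expected output file."""
    return ", ".join(f"0x{u:02x}" for u in units)


if __name__ == "__main__":
    for pair in ([0x01, 0x7F], [0x80, 0x7FF], [0x100000, 0x10FFFF]):
        print("code points:", show(pair))
        print("UTF-8:", show(to_utf8_units(pair)))
        print("UTF-16:", show(to_utf16_units(pair)))
```

For 0x100000, for example, the offset is 0xf0000, so the high surrogate is 0xd800 + 0x3c0 = 0xdbc0 and the low surrogate is 0xdc00, which agrees with the UTF-16 line for that case.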