Commit Graph

4 Commits

Author SHA1 Message Date
Julien Fischer
a56c4f5708 Merge integer tokens in lexer.
library/lexer.m:
     Merge the 'integer' and 'big_integer' tokens and extend them to include
     signedness and size information.  This conforms to recent changes to
     the rest of the system and is another step towards supporting additional
     types of integer literal.

library/parser.m:
mdbcomp/trace_counts.m:
     Conform to the above change.

tests/hard_coded/impl_def_lex.exp:
tests/hard_coded/impl_def_lex_string.exp:
tests/hard_coded/lexer_bigint.exp:
tests/hard_coded/lexer_zero.exp*:
tests/hard_coded/parse_number_from_string.exp*:
     Update these expected outputs.
2017-04-26 10:00:45 +10:00
Julien Fischer
e6e295a3cc Generalise the representation of integers in the term module.
In preparation for supporting uint literals and literals for the fixed size
integer types, generalise the representation of integers in the term module, so
that for every integer literal we record its base, value (as an arbitrary
precision integer), signedness and size (the latter two based on the literal's
suffix or lack thereof).

Have the lexer attach information about the integer base to machine sized ints;
we already did this for the 'big_integer' alternative but not the normal one.
In conjunction with the first change, this fixes a problem where the compiler
was accepting non-decimal integers in like arity specifications.  (The
resulting error messages could be improved, but that's a separate change.)

Support uints in more places; mark other places which require further work with
XXX UINT.

library/term.m:
     Generalise the representation of integer terms so that we can store
     the base, signedness and size of a integer along with its value.
     In the new design the value is always stored as an arbitrary precision
     integer so we no longer require the big_integer/2 alternative; delete it.

     Add some utility predicates that make it easier to work with integer terms.

library/term_conversion.m:
library/term_io.m:
     Conform to the above changes,

     Add missing handling for uints in some spots; add XXX UINT comments
     in others -- these will be addressed later.

library/lexer.m:
     Record the base of word sized integer literals.

library/parser.m:
compiler/analysis_file.m:
compiler/fact_table.m:
compiler/get_dependencies.m:
compiler/hlds_out_goal.m:
compiler/hlds_out_util.m:
compiler/intermod.m:
compiler/make.module_dep_file.m:
compiler/parse_class.m:
compiler/parse_inst_mode_name.m:
compiler/parse_item.m:
compiler/parse_pragma.m:
compiler/parse_sym_name.m:
compiler/parse_tree_out_term.m:
compiler/parse_tree_to_term.m:
compiler/parse_type_defn.m:
compiler/parse_util.m:
compiler/prog_ctgc.m:
compiler/prog_util.m:
compiler/recompilation.check.m:
compiler/recompilation.version.m:
compiler/superhomogeneous.m:
mdbcomp/trace_counts.m:
samples/calculator2.m:
extras/moose/moose.m:
    Conform to the above changes.

tests/hard_coded/impl_def_lex.exp:
tests/hard_coded/impl_def_lex_string.exp:
tests/hard_coded/lexer_bigint.exp*:
tests/hard_coded/lexer_zero.exp*:
tests/hard_coded/parse_number_from_string.exp*:
tests/hard_coded/term_to_unit_test.exp:
    Update these expected outputs.
2017-04-22 11:53:14 +10:00
Peter Wang
6a267a8f07 Make base_string_to_int check overflow/underflow for all bases.
Make base_string_to_int check for overflow and underflow when converting
from strings in all bases, not only base 10.  Fixes bug #376.

Previously it was stated that "numbers not in base 10 are assumed to
denote bit patterns and are not checked for overflow."  Though not a
safe assumption in general, in Mercury source files it is useful to be
able to write values with the high bit set, e.g. 0x80000000 on 32-bit
machines, that would be greater than max_int if interpreted as a
positive integer.

The changed behaviour of base_string_to_int would reject such literals
from Mercury sources, so additional changes are required to maintain
that usage.  However, unlike before, the compiler will report an
error if some non-zero bits of the literal would be discarded.

library/string.m:
	Enable overflow/underflow checking for base_string_to_int for
	any base.

	Update documentation.

library/lexer.m:
	Allow `big_integer' token functor to represent non-base 10
	literals as well.

	Add `integer_base' type.

library/term.m:
	Add `big_integer' term functor.

	Add `integer_base' type.

library/term_io.m:
	Add private helper functions `integer_base_int' and
	`integer_base_prefix'.

	Conform to changes.

library/parser.m:
	Pass through `big_integer' tokens as `big_integer' terms.

	Conform to changes.

compiler/prog_util.m:
	Add predicate to convert integer terms to ints with the
	aforementioned concession for bit patterns.

	`make_functor_cons_id' can now fail due to integer tokens
	exceeding the range of `int'.

compiler/superhomogeneous.m:
	Make `unravel_var_functor_unification' convert `big_integer'
	tokens on the RHS to a simple `int' with the aforementioned
	concession for bit patterns, or add an error message if any
	significant bits would be discarded.

compiler/fact_table.m:
compiler/mercury_to_mercury.m:
compiler/module_imports.m:
compiler/prog_io_util.m:
	Conform to changes.

compiler/make.util.m:
	Delete unused predicate.

tests/general/test_string_to_int_overflow.m:
tests/general/test_string_to_int_overflow.exp:
tests/general/test_string_to_int_overflow.exp2:
tests/general/test_string_to_int_overflow.exp3:
        Rewrite test case.

tests/hard_coded/lexer_bigint.exp:
tests/hard_coded/lexer_bigint.exp2:
tests/hard_coded/read_min_int.exp:
tests/hard_coded/read_min_int.exp2:
	Update expected outputs due to the lexer and term module changes.

tests/invalid/Mmakefile:
tests/invalid/invalid_int.err_exp:
tests/invalid/invalid_int.err_exp2:
tests/invalid/invalid_int.m:
	Add new test case.

NEWS:
	Announce the changes.
2015-02-17 12:14:16 +11:00
Peter Wang
0667216eba lexer.m tokenises "-INTEGER" as two tokens, a minus sign and a positive
Branches: main

lexer.m tokenises "-INTEGER" as two tokens, a minus sign and a positive
integer.  This fails when the overall negative value is min_int, i.e. the
absolute value is max_int+1 -- too big to store in an int.

One less obvious consequence of the bug is that io.read could not parse some
plain Mercury terms written out by io.write.

library/lexer.m:
	Add a `token.big_integer' constructor to hold big integer literals in
	their string representation.  Currently this is only done for base 10
	literals which cannot fit in an int.

library/parser.m:
	Parse the token sequence, minus sign followed by big_integer max_int+1,
	as the integer term with value min_int.

tests/hard_coded/Mmakefile:
tests/hard_coded/lexer_bigint.exp:
tests/hard_coded/lexer_bigint.exp2:
tests/hard_coded/lexer_bigint.inp:
tests/hard_coded/lexer_bigint.m:
tests/hard_coded/read_min_int.exp:
tests/hard_coded/read_min_int.exp2:
tests/hard_coded/read_min_int.inp:
tests/hard_coded/read_min_int.m:
	Add test cases.
2012-05-14 06:30:06 +00:00