mercury

mirror of https://github.com/Mercury-Language/mercury.git synced 2026-04-23 21:33:49 +00:00

Author	SHA1	Message	Date
Zoltan Somogyi	ca7385d2c7	Generate better diagnostics for parentheses mismatches. When you have an unclosed (, [ or { in a clause, the diagnostic you got did not tell you where the unclosed parenthesis was or which kind of parenthesis it was. Fix this by including both pieces of information in the diagnostic. Likewise, print more useful info for mixed-up parentheses, such as [(]). library/mercury_term_parser.m: When consuming a (, [ or { token, push it and its context on a stack. When consuming a ), ] or } token, pop off the top item from this stack, and generate a diagnostic if the close token does not match it. The one exception from this pushing and pulling is for code that handles the case where the open is followed immediately by the matching close, such as when parsing [] or {}. Print the contents of the stack also when getting to either the end of the term, or the end of the input, with a nonempty stack. Maintaining this stack has a small performance cost, but I expect it to be negligible, especially compared to the usefulness of the new detail in diagnostics. Completely rework the error handling parts of this module. The main changes are the following. First, the old code used to include part of the intended message in the pr_error structures it created, with a "Syntax error: " prefix being added later. Since this makes it hard to ensure that the error messages follow the rules of English, change this to generate each error message all at once. Second, the old code included the list of the remaining tokens in each pr_error structure. This was overkill, because the only part of this list that was used was the id and the context of the first token in the list. Apart from being inelegant, the main flaw of this approach was that in the case of premature end-of-file errors, the only token list available was token_nil, which of course contains neither a token nor its context. 
The old code compensated for it later by using the context of the first token of the whole term being parsed, which is ... less than useful. (The missing token is trivially replaced by "end-of-file".) The new code replaces the token list with the context, if it is available; if it is not, then later we compute the context of the last token in the whole token list. The new code does not return the token itself; instead, it includes its string version in the generated error message where appropriate. Third, as mentioned above, we now include info about unbalanced (), [] and {} pairs in diagnostics, as extra sentences. (These extra sentences are preceded by \n characters; see the change to parse_module.m below.) Fourth, to make the above possible without adding unnecessary complications, the diagnostic texts this module generates now always include the period at the ends of sentences: they are not added by the compiler. Fifth, we now consistently use "Syntax error at token abc: expected def, fgh, or xyz" phraseology. library/mercury_term_lexer.m: Stop requiring the customers of this module to handle - integer_dot tokens, which are needed only by, and are an implementation detail of, the get_* family of predicates, and - eof tokens, which the lexer also never returns, converting each one into the end of its token list instead. The fact that the lexer never returned integer_dot tokens was documented, but the fact that it never returned eof tokens was not. The reason for this change was simply that I did not want to write two pieces of code to handle the out-of-input case in each affected spot in the parser: once for an eof token, and once for token_nil. library/stack.m: Add a utility function needed by new code in mercury_term_parser.m. compiler/parse_module.m: Stop adding a period at the ends of error messages generated by mercury_term_parser.m; mercury_term_parser.m now adds those itself. 
Do post-process those messages by turning any \n characters in them into nl format_pieces. NEWS.md: Announce the change in mercury_term_lexer.m, and the new function in stack.m. library/io.text_read.m: Unrelated bug fix, for which I discovered the need while working on the other library files: add a missing foreign import. tests/invalid_nodepend/unbalanced.{m,err_exp}: A new test case to check the updated diagnostics. tests/invalid_nodepend/Mmakefile: Enable the new test case. tests/hard_coded/parse_number_from_string.exp: tests/invalid_nodepend/impl_def_literal_syntax.err_exp: tests/invalid_nodepend/invalid_binary_literal.err_exp: tests/invalid_nodepend/invalid_float_literal.err_exp: tests/invalid_nodepend/invalid_hex_literal.err_exp: tests/invalid_nodepend/invalid_octal_literal.err_exp: tests/invalid_nodepend/null_char.err_exp: tests/invalid_nodepend/typeclass_test_1.err_exp: tests/invalid_nodepend/unicode_1.err_exp: tests/invalid_nodepend/unicode_2.err_exp: tests/invalid_purity/purity_nonsense_2.err_exp: Expect the updated diagnostics.	2025-07-30 01:37:28 +02:00
Zoltan Somogyi	f6e5a438c4	Allow underscores before exponents in floats. library/mercury_term_lexer.m: As above. NEWS.md: Announce the change. doc/reference_manual.texi: Document the change. tests/hard_coded/parse_number_from_string.exp: tests/invalid_nodepend/invalid_float_literal.err_exp: Update these expected outputs after the change.	2023-07-02 02:20:39 +02:00
Zoltan Somogyi	ac57c2f70d	Improve diagnostics for malformed number literals. library/mercury_term_lexer.m: Make the diagnostics for malformed numbers more detailed, and thus more easily understandable. tests/hard_coded/parse_number_from_string.exp: tests/invalid_nodepend/invalid_binary_literal.err_exp: tests/invalid_nodepend/invalid_decimal_literal.err_exp: tests/invalid_nodepend/invalid_float_literal.err_exp: tests/invalid_nodepend/invalid_hex_literal.err_exp: tests/invalid_nodepend/invalid_octal_literal.err_exp: Update the expected error messages.	2023-06-19 01:10:19 +02:00
Julien Fischer	a56c4f5708	Merge integer tokens in lexer. library/lexer.m: Merge the 'integer' and 'big_integer' tokens and extend them to include signedness and size information. This conforms to recent changes to the rest of the system and is another step towards supporting additional types of integer literal. library/parser.m: mdbcomp/trace_counts.m: Conform to the above change. tests/hard_coded/impl_def_lex.exp: tests/hard_coded/impl_def_lex_string.exp: tests/hard_coded/lexer_bigint.exp: tests/hard_coded/lexer_zero.exp: tests/hard_coded/parse_number_from_string.exp: Update these expected outputs.	2017-04-26 10:00:45 +10:00
Julien Fischer	e6e295a3cc	Generalise the representation of integers in the term module. In preparation for supporting uint literals and literals for the fixed size integer types, generalise the representation of integers in the term module, so that for every integer literal we record its base, value (as an arbitrary precision integer), signedness and size (the latter two based on the literal's suffix or lack thereof). Have the lexer attach information about the integer base to machine sized ints; we already did this for the 'big_integer' alternative but not the normal one. In conjunction with the first change, this fixes a problem where the compiler was accepting non-decimal integers in places like arity specifications. (The resulting error messages could be improved, but that's a separate change.) Support uints in more places; mark other places which require further work with XXX UINT. library/term.m: Generalise the representation of integer terms so that we can store the base, signedness and size of an integer along with its value. In the new design the value is always stored as an arbitrary precision integer so we no longer require the big_integer/2 alternative; delete it. Add some utility predicates that make it easier to work with integer terms. library/term_conversion.m: library/term_io.m: Conform to the above changes. Add missing handling for uints in some spots; add XXX UINT comments in others -- these will be addressed later. library/lexer.m: Record the base of word sized integer literals. 
library/parser.m: compiler/analysis_file.m: compiler/fact_table.m: compiler/get_dependencies.m: compiler/hlds_out_goal.m: compiler/hlds_out_util.m: compiler/intermod.m: compiler/make.module_dep_file.m: compiler/parse_class.m: compiler/parse_inst_mode_name.m: compiler/parse_item.m: compiler/parse_pragma.m: compiler/parse_sym_name.m: compiler/parse_tree_out_term.m: compiler/parse_tree_to_term.m: compiler/parse_type_defn.m: compiler/parse_util.m: compiler/prog_ctgc.m: compiler/prog_util.m: compiler/recompilation.check.m: compiler/recompilation.version.m: compiler/superhomogeneous.m: mdbcomp/trace_counts.m: samples/calculator2.m: extras/moose/moose.m: Conform to the above changes. tests/hard_coded/impl_def_lex.exp: tests/hard_coded/impl_def_lex_string.exp: tests/hard_coded/lexer_bigint.exp: tests/hard_coded/lexer_zero.exp: tests/hard_coded/parse_number_from_string.exp*: tests/hard_coded/term_to_unit_test.exp: Update these expected outputs.	2017-04-22 11:53:14 +10:00
Julien Fischer	b3835dd826	Fix bug #430. The lexer was not allowing underscores between leading zeros in decimal integer literals and float literals (e.g. 0_0 or 0_0.0). library/lexer.m: Allow underscores in the above cases. tests/hard_coded/parse_number_from_io.{m,exp,exp2,exp3}: tests/hard_coded/parse_number_from_string.{m,exp,exp2,exp3}: Extend these tests to cover the above cases. tests/invalid/invalid_binary_literal.err_exp: tests/invalid/invalid_octal_literal.err_exp: tests/invalid/invalid_hex_literal.err_exp: Conform to the above change.	2017-01-29 14:38:26 +11:00
Julien Fischer	d8f0d402fe	Document underscores in numeric literals. Standardize terminology in error messages about ill-formed numeric literals. doc/reference_manual.texi: Document underscores in numeric literals. Add a TODO comment about a future piece of work. NEWS: Announce the addition of support for underscores in numeric literals. library/lexer.m: Use the term "literal" instead of "constant" or "token" when referring to numeric literals in error messages. s/hex/hexadecimal/ in those same error messages. tests/hard_coded/parse_number_from_string.exp*: tests/invalid/invalid_{binary,decimal,hex,octal}_literal.err_exp: Conform to the above change in error messages from the lexer.	2017-01-16 16:24:48 +11:00
Julien Fischer	61c4ef7e50	Allow optional underscores in numeric literals. Allow the optional use of underscores in numeric literals for the purpose of improving their readability (e.g. by grouping digits etc). We allow any number of underscores between digits and also between the radix prefix (if present) and the initial digit. (When integer type suffixes are supported we will also allow them to be preceded by any number of underscores.) The following are not allowed: 1. Leading underscores. 2. Trailing underscores. 3. Underscores inside the components of a radix prefix (e.g. 0_xffff or 0__b101010.) 4. Underscores immediately adjacent to the decimal point in a float literal (e.g. 123_._123.) 5. Underscores immediately adjacent to the exponent ('e' or 'E') in a float literal (e.g. 123_e12 or 123E_12.) 6. Underscores immediately adjacent to the optional sign of an exponent in a float literal (e.g. 123_+e12 or 123-_E12.) 7. Underscores between the optional sign of an exponent and the exponent indicator (e.g. 123+_e12.) library/lexer.m: Modify the scanner to account for underscores in numeric literals according to the scheme above. library/string.m: library/integer.m: Export undocumented functions for converting strings containing underscores into ints or integers respectively. tests/hard_coded/parse_number_from_io.{m,exp}: Test parsing of valid numeric literals from file streams. tests/hard_coded/parse_number_from_string.{m,exp}: Test parsing of valid and invalid numeric literals from strings. tests/invalid/invalid_binary_literal.{m,err_exp}: tests/invalid/invalid_decimal_literal.{m,err_exp}: tests/invalid/invalid_octal_literal.{m,err_exp}: tests/invalid/invalid_hex_literal.{m,err_exp}: tests/invalid/invalid_float_literal.{m,err_exp}: Test parsing of invalid numeric literals from file streams. tests/hard_coded/parse_number_from_{io,string}.m: tests/hard_coded/parse_number_from_{io,string}.exp: Test parsing of valid numeric literals. 
tests/hard_coded/Mmakefile: tests/invalid/Mmakefile: Add the new test cases.	2017-01-12 01:10:31 +11:00
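
The underscore rules listed in commit 61c4ef7e50 can be illustrated with a small validator. This is only a sketch in Python, not the Mercury lexer: the function name `is_valid_literal`, the regular expressions, and the simplified float grammar (digits, optional fraction, optional exponent, with at least one of the two present) are assumptions made for illustration. It follows the rules as originally stated, so it rejects underscores next to the exponent even though a later commit (f6e5a438c4) relaxes that restriction.

```python
import re

# One or more digits from `digits`, with any number of underscores
# allowed only BETWEEN digits (so no leading or trailing underscore).
def _run(digits: str) -> str:
    return rf"[{digits}](?:_*[{digits}])*"

DEC = _run("0-9")

# Radix prefix must be written intact (0b, 0o, 0x); underscores may
# follow the prefix but may not appear inside it.
INTEGER = re.compile(
    rf"(?:0b_*{_run('01')}|0o_*{_run('0-7')}|0x_*{_run('0-9a-fA-F')}|{DEC})"
)

# Simplified float grammar (an assumption): fraction and/or exponent.
# Because every component is a DEC run, no underscore can touch the
# decimal point, the exponent letter, or the exponent sign.
FLOAT = re.compile(
    rf"{DEC}(?:\.{DEC}(?:[eE][+-]?{DEC})?|[eE][+-]?{DEC})"
)

def is_valid_literal(text: str) -> bool:
    """Return True if `text` is an acceptable numeric literal in this sketch."""
    return bool(INTEGER.fullmatch(text) or FLOAT.fullmatch(text))
```

For example, `is_valid_literal("1_000_000")` and `is_valid_literal("0x_ff_ff")` hold, while `"0_xffff"`, `"123_"` and `"123_e12"` are rejected.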

8 Commits
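
The stack-based bracket tracking described in commit ca7385d2c7 can be sketched as follows. This is a hedged illustration in Python rather than the actual code in mercury_term_parser.m: it works on raw characters instead of lexer tokens, ignores strings and comments, and the names `Diagnostic` and `check_brackets` as well as the message wording are hypothetical.

```python
from dataclasses import dataclass

# Map each closing bracket to the opening bracket it must match.
CLOSERS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(CLOSERS.values())

@dataclass(frozen=True)
class Diagnostic:
    line: int      # line the diagnostic refers to
    message: str   # human-readable description

def check_brackets(text: str) -> list[Diagnostic]:
    """Report mismatched and unclosed brackets, with their kind and position."""
    stack: list[tuple[str, int]] = []   # (opening bracket, line it appeared on)
    diagnostics: list[Diagnostic] = []
    line = 1
    for ch in text:
        if ch == "\n":
            line += 1
        elif ch in OPENERS:
            # Remember both the bracket kind and where it was opened.
            stack.append((ch, line))
        elif ch in CLOSERS:
            if not stack:
                diagnostics.append(Diagnostic(line, f"unmatched '{ch}'"))
                continue
            open_ch, open_line = stack.pop()
            if open_ch != CLOSERS[ch]:
                diagnostics.append(Diagnostic(
                    line,
                    f"'{ch}' does not match '{open_ch}' opened on line {open_line}",
                ))
    # Anything left on the stack at end of input was never closed.
    for open_ch, open_line in stack:
        diagnostics.append(Diagnostic(open_line, f"unclosed '{open_ch}'"))
    return diagnostics
```

Calling `check_brackets("[(])")` reports two mismatches, and `check_brackets("foo(\n[a, b")` reports the unclosed `(` on line 1 and the unclosed `[` on line 2, which is the kind of detail the commit adds to the diagnostics.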