Estimated hours taken: 24
Add inline assembler hacks for restoring the $gp register on the alpha,
to make the `asm_fast.gc' grade work on the alpha.
runtime/goto.h:
For the alpha, change the GOTO macro so that it sets up
register $27 (the "procedure value" register) to contain the
address to which you are going to jump, and change the
Define_{entry,local,static} macros so that they have a `ldgp'
instruction after the label which restores the gp register from
$27.
runtime/call.mod:
For the alpha, change the call macro so that it uses a `ldgp'
instruction to restore the gp register on return from a call.
Add new call_localret macro, for the case when the continuation
label is a local label (this is like the inverse of the localcall
macro), because a call_localret is two instructions cheaper
than the equivalent call would be.
runtime/engine.mod:
In call_engine(), we need to use `Define_label(engine_done)'
rather than just `engine_done:', so that it works on the alpha.
These changes improve the speed of the compiler by about 21%,
and cut executable size by about 24%.
There's still a fair bit of room for improving efficiency by
avoiding moves to $27 and `ldgp' instructions in situations
where they aren't needed.
Estimated hours taken: 1
library/io.m:
Moved the profiling initialisation code so that it sits around
the call to main/2(0). Thus the profiler will now begin all its
profiles from main, not from call_engine.
profiler/read.m:
The profiler now expects its addresses to be written out in decimal.
read.m first tries to parse an address as decimal; if that fails,
it reads the address as hexadecimal instead. Thus the profiler will
still be able to read its old files.
NB. A possible bug with this is that two numbers will get converted to
the same number if one is treated as hex and the other decimal.
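
The decimal-then-hexadecimal fallback described above can be sketched
in C as follows. This is only an illustration: the real code lives in
profiler/read.m and is written in Mercury, and the helper name
parse_address() is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
** Hypothetical helper, not the code in profiler/read.m.
** Try to parse `text' as a decimal address; if that fails,
** try again as hexadecimal. Returns 1 and stores the value
** in *addr on success, returns 0 otherwise.
*/
static int
parse_address(const char *text, unsigned long *addr)
{
	static const int	bases[] = { 10, 16 };
	size_t			i;
	char			*end;

	assert(text != NULL && addr != NULL);

	for (i = 0; i < sizeof bases / sizeof bases[0]; i++) {
		errno = 0;
		*addr = strtoul(text, &end, bases[i]);
		/* Accept only if the whole string was consumed. */
		if (errno == 0 && end != text && *end == '\0') {
			return 1;
		}
	}

	return 0;
}
```

With this sketch, "1f00" is rejected as decimal and read as hex, while
"1000" is always read as decimal, even if an old file meant 0x1000.
That is the ambiguity noted in the NB above.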
profiler/.cvsignore:
Some more files to ignore.
runtime/{engine, wrapper}.mod:
Remove the profiling initialisation code.
runtime/imp.h:
Add a new call macro noprof_call() which never records an arc
in the profiling call graph. This is used to ignore some
of the initial system set-up calls.
runtime/prof.c:
Output the addresses as decimal integers instead of hexadecimal.
imp.h, engine.mod, io_rt.mod:
Reorder #include lines so that "regs.h" gets included
before any of the system header files (other than <stddef.h>).
This is necessary because on some systems, the system header files
contain inline functions, and the global register variable
declarations must precede these.
regs.h:
Add case for i386, since global register variables now work
on the 386 (the problem was just that we needed -fno-builtin).
label.c, wrapper.mod:
Add #include of <string.h> for strcmp().
(These missing #includes were not previously noticed because gcc has
a builtin declaration for strcmp() - it only caused a warning
when I compiled with `-fno-builtin'.)
regorder.h:
Change the register allocation order so that `sp' gets
allocated first, since for the compiler, 36% of all Mercury
register references are to `sp'.
memory.c:
With `-dm', if we're using CONSERVATIVE_GC, don't output the
size etc. for the heap, since they will all be zero - the
heap gets allocated by the Boehm collector.
runtime/{engine.h,engine.mod,memory.c,wrapper.mod}:
Add a new option `-dt' for use with the modes that
don't use gcc non-local gotos. If this option is
enabled, the runtime system will use the slow driver
loop rather than the unrolled one, and if a seg fault
occurs it will print out the last 40 locations.
(This is most useful in combination with -DDEBUG_LABELS,
otherwise you'll just get hex addresses.)
conf.h.in:
Set up the type of Word at configuration time.
imp.h:
Changed definition of Word. Also gathered two of the three
definitions of hash_string into one place, defining a macro
for use by aux.c.
aux.c:
Use the macro in imp.h for the body of hash_string.
engine.mod:
Minor formatting change.
Mmake:
Separated out the handwritten .c files from those generated from .mods.
Added a rule for invoking ctags.
engine.c:
Remove the label do_reset_framevar0_fail, an anachronism. Some
formatting changes.
io_rt.mod:
Flush stdout when processing error/1.
memory.c:
Print out the PC in the SIGBUS handler as well as the SIGSEGV handler.
Some formatting changes.
stacks.h:
Cast pointers to nondet frames to (Word *) before use. If they are
stored in e.g. detstackvars, their native type is just Word.
table.c:
Strengthened debugging capability. A check for whether the hash value
is within bounds is now turned on by default. Later we can turn it off
again.
wrapper.mod:
Changed initialization code to always allocate space for the label
table, even if init_modules does not need to be called. This avoids
a crash in some situations.
runtime/engine.mod:
Add some debugging code to keep track of the last 40
locations jumped to in a circular buffer which can
be examined in gdb after a seg fault, if we are not
using non-local gotos.
(This is the same sort of effect as running the program with
-dg and piping the result into tail -40, except that it is
orders of magnitude cheaper.)
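
A minimal sketch of such a circular buffer is given below. It is not
the code in engine.mod; the names NUM_PREV_LOCS, prev_locs,
record_location() and location_before() are assumptions made for the
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name; the log entry only says "the last 40". */
#define NUM_PREV_LOCS	40

static void	*prev_locs[NUM_PREV_LOCS];
static int	prev_index = 0;	/* slot the next jump will overwrite */

/* Called once per jump in the (slow) driver loop. */
static void
record_location(void *addr)
{
	prev_locs[prev_index] = addr;
	prev_index = (prev_index + 1) % NUM_PREV_LOCS;
}

/*
** Return the address recorded `age' jumps ago (0 is the most
** recent), or NULL if fewer than age + 1 jumps were recorded.
*/
static void *
location_before(int age)
{
	assert(age >= 0 && age < NUM_PREV_LOCS);
	return prev_locs[(prev_index - 1 - age + NUM_PREV_LOCS)
		% NUM_PREV_LOCS];
}
```

Recording a jump costs one store and one index update, which is why
this is so much cheaper than printing every jump and keeping only the
tail of the output.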
CFLAGSFILE:
Added an option to turn time and call profiling on separately.
imp.h:
Added macro 'update_prof_current_proc'. Also removed stack
implementation of time profiling.
engine.mod:
Added all the init and finish code for time profiling.
label.*:
Changed option from USE_PROFILING to PROFILE_CALLS.
prof.*:
Added all the necessary code to do time profiling.
imp.h:
Redefined the tailcall macros to include the Caller address for
profiling, and to call the profile function.
*.mod:
Changed all the handwritten tailcalls to conform to the new tailcall.
Need to add makelocalentry macro so that I can save all the handwritten
label names and they can be looked up by mprof.
prof.*:
Added some more comments.
Cakefile:
Included prof.c and prof.h in the compile.
CFLAGSFILE:
Added the USE_PROFILING flag.
engine.mod:
Added calls so that all the profiling info is dumped to a file.
prof.*:
Added some comments.
imp.h:
Modified the call and localcall macros so that they save the
callee-caller address pair if USE_PROFILING defined.
*.mod:
Added the extra argument to all calls and local calls for the
handwritten mod files.
Some of these are included in imp.h, some aren't; the criterion is whether
automatically generated modules need the information in them.
Created a system whereby Mercury programs can be automatically compiled.
One part is the Mercury cakefile in examples (and the Cakefile that uses it),
which defines two targets for each Mercury program. These are e.g. nrev_fast
and nrev_debug for nrev.nl. As usual with System, the file Conf has one
line for each program in the directory; the file Entry likewise contains
the default entry point. Mod2init now inserts the default entry point
into the xxx_init.c file, and test_harness picks it up from there.
The other part is the creation and installation of the files and tools
referred to by the Mercury cakefile. These include scripts in /usr/contrib/bin,
the system modules in /usr/contrib/lib/mercury/modules, the system header
files in /usr/contrib/lib/mercury/inc, and two libraries, lib{fast,debug}mer.a
in each of /usr/contrib/lib/mercury/{sun4,sgi}.
specify their sizes on the command line.
The layout of the memory areas is optimized for reducing conflicts
in primary caches. Any mrXY that are not real registers are #defined
to unreal_reg_XY. Unfortunately, one cannot control the layout of
these variables wrt the cache. #defining excess mrXY as unreal_regs[XY]
avoids conflicts but nevertheless slows down access considerably
due to the longer instruction sequences required.
On SVR4 machines, we now set up a redzone at the end of each area.
The sizes of these areas are configurable by options.
Labels used to be stored in a fixed-size table with linear search.
They are now stored in two hash tables, so that they can be looked up
either by address or by name.
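
The two-table arrangement can be sketched as below: each label record
is linked into one chained hash table keyed by name and another keyed
by address. This is an illustration only, not the code in label.c;
the type Label, the table size and all function names are assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE	64	/* hypothetical; real sizes are not given */

typedef struct Label {
	const char	*name;
	void		*addr;
	struct Label	*next_by_name;	/* chain in by_name[] */
	struct Label	*next_by_addr;	/* chain in by_addr[] */
} Label;

static Label	*by_name[TABLE_SIZE];
static Label	*by_addr[TABLE_SIZE];

static unsigned
hash_name(const char *s)
{
	unsigned	h = 0;

	while (*s != '\0') {
		h = h * 31 + (unsigned char) *s++;
	}
	return h % TABLE_SIZE;
}

static unsigned
hash_addr(const void *addr)
{
	return (unsigned) (((uintptr_t) addr >> 2) % TABLE_SIZE);
}

/* Insert one record into both tables; returns NULL if out of memory. */
static Label *
insert_label(const char *name, void *addr)
{
	Label		*label = malloc(sizeof *label);
	unsigned	n, a;

	assert(name != NULL);
	if (label == NULL) {
		return NULL;
	}
	n = hash_name(name);
	a = hash_addr(addr);
	label->name = name;
	label->addr = addr;
	label->next_by_name = by_name[n];
	label->next_by_addr = by_addr[a];
	by_name[n] = label;
	by_addr[a] = label;
	return label;
}

static Label *
lookup_by_name(const char *name)
{
	Label	*l;

	for (l = by_name[hash_name(name)]; l != NULL; l = l->next_by_name) {
		if (strcmp(l->name, name) == 0) {
			return l;
		}
	}
	return NULL;
}

static Label *
lookup_by_addr(const void *addr)
{
	Label	*l;

	for (l = by_addr[hash_addr(addr)]; l != NULL; l = l->next_by_addr) {
		if (l->addr == addr) {
			return l;
		}
	}
	return NULL;
}
```

Sharing one record between both tables means a label is stored once
and either lookup costs a short chain walk instead of a linear scan of
the whole table.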
The timing package now counts time in milliseconds and not microseconds,
thus allowing longer runs without overflow.
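
To give a feel for the difference, the sketch below computes the
longest run that fits in a counter, assuming (this is not stated in
the log) that the counter is a signed 32-bit integer. The function
name max_run_seconds() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
** Longest run, in whole seconds, that a signed 32-bit counter can
** represent when it advances `ticks_per_second' times per second.
** The 32-bit width is an assumption made for this illustration.
*/
static long
max_run_seconds(long ticks_per_second)
{
	assert(ticks_per_second > 0);
	return (long) (INT32_MAX / ticks_per_second);
}
```

Under that assumption, max_run_seconds(1000000L) is 2147 (about 35
minutes when counting microseconds) and max_run_seconds(1000L) is
2147483 (about 24 days when counting milliseconds).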
made childfr be only a local variable inside macros
made calls to get_run_time governed by the OWNTIMER flag
renamed tests to Sanity
removed some overheads from nrev benchmark