Enforce the use of the fork path. Having an optional fork path was nice
when working in the shell:
{ok, Child} = alcove:fork(Drv).
Instead of:
{ok, Child} = alcove:fork(Drv, []). % port process forks
However it introduced a few problems:
* made the interface inconsistent and ambiguous
alcove:kill(Drv, Pid, 9)
% vs
alcove:kill(Drv, [], Pid, 9) % the port process is sending the signal
* calls could not have optional arguments
Whether or not calls should have optional arguments is an open question
but the optional fork path would have conflicting arities:
For example, the last argument to mount/8 is used only by Solaris:
% arity 6
-spec mount(alcove_drv:ref(),iodata(),iodata(),iodata(),uint64_t() | [constant()],iodata()) -> 'ok' | {'error', file:posix() | 'unsupported'}.
% arity 7
-spec mount(alcove_drv:ref(),iodata(),iodata(),iodata(),uint64_t() | [constant()],iodata(),iodata()) -> 'ok' | {'error', file:posix() | 'unsupported'}.
% arity 7
-spec mount(alcove_drv:ref(),fork_path(),iodata(),iodata(),iodata(),uint64_t() | [constant()],iodata()) -> 'ok' | {'error', file:posix() | 'unsupported'}.
% arity 8
-spec mount(alcove_drv:ref(),fork_path(),iodata(),iodata(),iodata(),uint64_t() | [constant()],iodata(),iodata()) -> 'ok' | {'error', file:posix() | 'unsupported'}.
* because of the ambiguity in arity, each call can't have an optional
timeout
call/5 sets a timeout of 'infinity'. The timeout isn't accessible from
the "named" functions (e.g., fork, chdir, ...), which means that users
who need the timeout will have to resort to using the call/5
interface. An unfortunate side effect of using call/5 is that dialzyer
won't be able to type check the arguments.
The result of removing the functions without the fork path is that the
code ends up being simpler and more consistent.
Return {error, unsupported} if an atom is used as an argument and the
constant the atom represents does not exist on the current platform.
The previous behaviour was inconsistent and non-deterministic. The
constant might:
* return {error,einval}. System return values could not be distinguished
from alcove return values.
* cause an exception
* be silently ignored
Returning a "plain" error tuple for unsupported calls breaks the type
spec for functions:
On BSD:
-spec setproctitle(alcove_drv:ref(),iodata()) -> 'ok'.
On Linux and Solaris:
-spec setproctitle(alcove_drv:ref(),iodata()) -> {'error','unsupported'}.
So the type spec would need to be amended to be the union of both return
values and portable code would have to test for both cases, with the
effect that any call in the future might return 'unsupported' if alcove
were ported to a new OS.
Attempting to make an supported call will now result in an "undefined
function" exception:
1> catch alcove:setproctitle(P, "foo").
{'EXIT',{undef,[{alcove,call,
[<0.46.0>,setproctitle,["foo"]],
[{file,"src/alcove.erl"},{line,426}]},
{erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,661}]},
{erl_eval,expr,5,[{file,"erl_eval.erl"},{line,434}]},
{shell,exprs,7,[{file,"shell.erl"},{line,684}]},
{shell,eval_exprs,7,[{file,"shell.erl"},{line,639}]},
{shell,eval_loop,3,[{file,"shell.erl"},{line,624}]}]}}
2> os:type().
{unix,linux}
This still places a burden on the caller. Portable code either needs to
hardcode supported functions by OS:
% And imagine all the error checking this code is not doing ...
proctitle(Name) ->
case os:type() of
{unix,linux} -> alcove:prctl(...);
{unix,BSD} where BSD = ... -> alcove:proctitle(...)
Or run through sequences of awkward try/catch statements:
proctitle(Name) ->
try alcove:proctitle(...) of
ok -> ok
catch
error:undef ->
try alcove:prctl(...) of
{ok,_,_,_,_,_} -> ok
catch
error:undef ->
% Does Solaris even?
...
end,
end
Dealing with portability should be the job of a higher level library
built on top of alcove.
Mount filesystems using the Solaris-specific version of the mount
interface. Solaris adds an options and options length parameter to
mount which takes a NULL terminated string of comma separated arguments.
On other Unix'es the options are either included in the mount flags
(MS_NOEXEC, ...) or in the data argument (<<"size=128M">> for tmpfs).
The behaviour of mount(2) on Solaris is bizarre: the options argument is
input/output, with the mount options placed in the buffer on return.
If the MS_OPTIONSTR is present in the mount flags and the options buffer
is too small, the mount call returns -1 and ERRNO is set to EOVERFLOW
but the mount actually succeeds! A more robust interface might truncate
the options to the size of the buffer, possibly seting the options
length to the required length and return 0.
Surprisingly, the options buffer can also be too large. This is so
weird, it must be a bug in alcove. If the buffer exceeds a certain size,
mount returns -1 with ERRNO set to EINVAL. The mount fails in this case:
{error,einval} = alcove:mount(Drv, [Child], "swap", Dir, "tmpfs",
[ms_optionstr], <<>>, <<"size=16m", 0:(1024*8)>>).
Since Solaris has this extra argument, mount/6,7 has to be extended to
mount/7,8, forcing all platforms to pass in an options parameter. This
parameter is ignored on all platforms except Solaris. The result is that
the mount interface is not the Linux mount(2) interface, it is some weird
hybrid that is awkward to use on all platforms (the interface does not
map to the mount(2) man page on any platform).
The value of the option parameter is also not returned to the caller on
Solaris. Options to fix this include:
* breaking out Solaris mount(2) to a Solaris specific call (mountext or
whatever)
* checking if opt is non-NULL on return and opt len > 0. If so, return:
{ok, binary()}
This extends the mount/7,8 type to:
ok | {ok,binary()} | {error,posix()}
There is a race condition in signal handling that gives invalid
responses to calls on some systems:
in function alcove_seccomp_tests:'-kill/1-fun-2-'/2 (test/alcove_seccomp_tests.erl, line 76)
**error:{assertEqual_failed,[{module,alcove_seccomp_tests},
{line,76},
{expression,"Reply1"},
{expected,{'EXIT',{termsig,sigsys}}},
{value,ok}]}
alcove_seccomp_tests:77: kill...*failed*
in function alcove_seccomp_tests:'-kill/1-fun-4-'/2 (test/alcove_seccomp_tests.erl, line 77)
**error:{assertEqual_failed,[{module,alcove_seccomp_tests},
{line,77},
{expression,"Reply2"},
{expected,{error,esrch}},
{value,ok}]}
The failed tests indicates the parent process has seen the child
process' control fd has been closed but the child process has either not
yet terminated or the termination status has not been propagated to the
parent.
There are 2 types of calls: calls that return a value and calls that
never return such as execve(2), execvp(2) and exit(2).
For calls that do not return, the closing of the child's control fd
indicates success.
For all other calls, the child should not exit. If the parent generates
a "fdctl_closed" event on behalf of the child, the erlang process blocks
until a termsig event is received.
Pass in an integer or a list of integers/atoms for the mount flags
parameter.
The mount flags is an unsigned integer and the decode function returns
an int. While this is probably ok, there are a number of typecasts in
the lookup code that needs to cleaned up.
Accept an integer or a list of atoms/integers for the flags argument. If
a list is passed, the value are OR'ed together. The values for atoms are
looked up from the constants defined in the system header files.
Accept and return signals as names. For example, if a signal is trapped,
the process will receive:
{signal,'SIGCHLD'}
Versus:
{signal,17} % on linux
Benefits:
* similar to the way errno is returned, e.g., {errno,enametoolong}
* better portability: integer values differ by platform
* library user does not need lookup the signal value before calling
sigaction/3,4 or pattern matching the process mailbox
Problems:
* sigaction/3,4 will throw badarg if the name is unknown
Probably it should return an error ({error,einval}).
* not all functions dealing with signals have been changed to use atoms,
e.g., kill/3,4
* all functions with mapping of constants -> integers should take
constants:
clone_define
mount_define
file_define
prctl_define
rlimit_define
syscall_define
If this is done, alcove:define/2,3 and the *_define/*_constant functions
should be removed.
* should the constants be upper or lower case atoms?
atoms are typically lowercased, e.g., {error,ebadf} not
{error,'EBADF'}
For example:
PR_SET_PDEATHSIG -> pr_set_pdeathsig
* if the interface accepts constants, should support for integers be
removed?
1. No way to pass in a "raw" signum if the signal does not have a name
2. Signals without a name will be sent to the process mailbox as:
{signal,unknown}
If the caller has trapped 58 and 59, there won't be a way to
distinguish these signals.
Replace use of a list to pass arguments to a call with a tuple. Parsing
the elements in the external term format is less error prone. A list, in
the term format, can be:
* a string (a char buf)
* a list header followed by terms, terminated with an empty list
* a list header followed by 1 element, followed by a list header ...,
terminated by an empty list
Since the calls take a fixed number of arguments (none of the vararg
versions are used), a tuple might be a better erlang representation.
It also fits in better with the usage of tuples in other functional
languages, where a tuple can hold any types but a list contains the
same type.
alcove:call(Port, fork, []) -> fork()
alcove:call(Port, exit, [1]) -> exit(1)
alcove:call(Port, execvp, ["/bin/echo", ["/bin/echo",
"foobar"]]) -> execvp("/bin/echo", argv)
alcove:call(Drv, open, ["/tmp/foo", 0, 0]) ->
open("/tmp/foo", 0, 0)
Would become:
alcove:call(Port, fork, {}) -> fork()
alcove:call(Port, exit, {1}) -> exit(1)
alcove:call(Port, execvp, {"/bin/echo", ["/bin/echo",
"foobar"]}) -> execvp("/bin/echo", argv)
alcove:call(Drv, open, {"/tmp/foo", 0, 0}) ->
open("/tmp/foo", 0, 0)
However that exposes the caller to the dreaded single element tuple.
Simplify the C port code by converting the argument list to a tuple.
Convert to static buffers for encoding/decoding the erlang external term
format using ei.
Use of pthreads when fork/exec is involved is risky. After a process
calls fork(), threads may be holding a lock and cause a deadlock. So the
operations that can be performed after a fork (before exec) are
basically limited to the same operations that can be done in a signal
handler. Operations that may be unsafe include malloc and the f* stream
functions.
alcove relied on erl_interface for encoding/decoding the external term
format. Internally, erl_interface uses pthreads.
Since the interesting part of alcove is the set of operations that can
be done post-fork and pre-exec, remove the dependency on erl_interface. The
change was done manually and was pretty massive for something done
by hand. alcove compiles and all the tests pass but there will be
regressions, especially considering not all calls are tested. For example,
cgroups are likely broken.
These changes have been tested only on Linux/32-bit/ARM.
This change had the benefit of simplifying the code and applying limits
to some of the inputs.
ei was used for convenience. It may be better to move to a custom
protocol or building a custom library for decoding the ETF.
ei/the term format has many problems:
* lists of small integers are encoded as strings
A list may be magically converted to a string if the list is comprised
of integers 0-255. Interestingly, 0 is considered to be a string even
though strings are NULL-terminated. So [0] is encoded as a string.
C code must catch both cases and possibly normalize the string into a
list. Even trickier since a list is terminated by an empty list.
* integer overflows
All ei functions dealing with static buffers take an pointer to an
offset. This index is a signed integer which can overflow.
Not a problem for alcove since messages are limited to lengths that can be
represented in 2 bytes.
* buffer overflows
The ei functions do not take a length. While the size of certain types
(strings, binaries, atoms) can be queried using ei_get_type(), other
types report a size of 0. Since the size of these encodings are not
known to the caller, it is questionable what is an appropriately sized
buffer.
The term encoding changes in this commit are sketchy and need to be
cleaned up. Some of the functions encode a complete term, others encode
at an offset and others return an allocated buffer.
To ensure consistency, exit if a timeout is set and reached. Otherwise,
late messages may arrive in the queue, messing up subsequent calls:
1> {ok,P} = alcove_drv:start().
{ok,<0.45.0>}
2> catch alcove:call(P, [], getpid, [], 0).
{'EXIT',timeout}
3> alcove:version(P).
2897
4> flush().
Shell got {alcove_call,<0.45.0>,[],<<"0.6.1">>}
ok
Using timeouts might be needed if it is not known whether is still
running in the event loop.
Remove the cast functions, since they are dangerous and can be emulated
by using call/5.
Fix the code to match the documentation and add a call with timeout.
Simplify the alcove_drv module by removing the versions of call with
defaults. The alcove module does the work of setting default values.