Commit Graph

116 Commits

Author SHA1 Message Date
Michael Santos
882494c3bc Begin adding documentation and examples 2014-04-18 18:02:44 -04:00
Michael Santos
d00c928474 Clean up 2014-04-18 10:08:20 -04:00
Michael Santos
7efd433d87 Move macro to main header 2014-04-18 09:15:21 -04:00
Michael Santos
e2a57fabbf Test if the ctl fd is already open
valgrind will use fd 3 for the log file. Check if the fd has been
opened.
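
A minimal sketch of such a check, assuming fcntl(2) with F_GETFD is used
to probe the descriptor. The function name fd_is_open is hypothetical and
is not taken from the source:

```c
#include <errno.h>
#include <fcntl.h>

/* Hypothetical helper: returns 1 if fd refers to an open descriptor,
 * 0 if it is closed (EBADF), and -1 if fcntl() failed for another
 * reason. */
int fd_is_open(int fd)
{
    if (fcntl(fd, F_GETFD) != -1)
        return 1;

    return (errno == EBADF) ? 0 : -1;
}
```

A caller could probe fd 3 this way before assuming it is free for the
control channel.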
2014-04-16 10:58:57 -04:00
Michael Santos
def2e32755 Fix memory leaks 2014-04-15 16:51:22 -04:00
Michael Santos
b0d7f7b6d7 Resize the poll array if RLIMIT_NOFILE increases
Allocate the poll array once and re-use it. If the caller increases the
number of file descriptors supported by the process, try to account for
it by re-sizing the poll array.

Caveats:

The poll array will only grow. Because the offset into the array is used
for indexing the fd (similar to how the FD_SET macros work with
select(2)), decreasing the array with existing fd's may result in the
fd's exceeding the size of the array.

Using sysconf(_SC_OPEN_MAX) is not portable. On Linux, it seems to call
getrlimit(RLIMIT_NOFILE).

The value of maxfd is checked in the event loop rather than doing it in
the call to setrlimit because there may be other methods for changing
the limit, for example, using prctl.
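
The grow-only behaviour described above might look roughly like the
following. This is a sketch under assumptions: struct pollset,
pollset_grow and pollset_sync are hypothetical names, and the limit is
read with getrlimit(RLIMIT_NOFILE) rather than sysconf(_SC_OPEN_MAX):

```c
#include <poll.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Hypothetical state: a poll array indexed by fd, plus its capacity. */
struct pollset {
    struct pollfd *fds;
    size_t nfds;
};

/* Grow the array to hold maxfd entries. The array never shrinks, so
 * existing fds always remain valid indexes. Returns 0 on success and
 * -1 on failure, leaving the old array untouched. */
int pollset_grow(struct pollset *ps, size_t maxfd)
{
    struct pollfd *fds;
    size_t i;

    if (maxfd <= ps->nfds)
        return 0;

    /* Guard against overflow, e.g. when the limit is RLIM_INFINITY. */
    if (maxfd > SIZE_MAX / sizeof(*fds))
        return -1;

    fds = realloc(ps->fds, maxfd * sizeof(*fds));
    if (fds == NULL)
        return -1;

    for (i = ps->nfds; i < maxfd; i++) {
        fds[i].fd = -1;     /* poll(2) ignores negative fds */
        fds[i].events = 0;
        fds[i].revents = 0;
    }

    ps->fds = fds;
    ps->nfds = maxfd;
    return 0;
}

/* Meant to be called at the top of each event loop iteration, so a
 * limit changed by any means is picked up. */
int pollset_sync(struct pollset *ps)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;

    return pollset_grow(ps, (size_t)rl.rlim_cur);
}
```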
2014-04-14 10:04:11 -04:00
Michael Santos
5ec13569aa Remove the limit on number of child processes
Base the default number of child processes per fork on the number of
available file descriptors. The maximum number of processes per fork is
an unsigned 2 byte integer.

If this value exceeds the RLIMIT_NOFILE, {error,emfile} will be
returned to the caller.
2014-04-13 13:35:43 -04:00
Michael Santos
f082295272 Replace select(2) with poll(2)
To remove the arbitrary limit on file descriptors and, consequently,
the number of forked processes, use poll(2) instead of select(2).

The downside of poll(2) is that it restricts which types of file
descriptors can be monitored, for example, devices on Mac OS X. This
matters because allowing the user to add fd's to the event loop is on
the roadmap.
2014-04-13 13:00:18 -04:00
Michael Santos
5ca6c107ed Pass the process state into the pid iterator funs 2014-04-12 10:28:42 -04:00
Michael Santos
cd2c8ecb31 Add chown(2) 2014-04-11 11:03:48 -04:00
Michael Santos
0b28a1778e Add chmod(2) 2014-04-11 10:57:16 -04:00
Michael Santos
b986ce354f Clarify msg length by using symbolic constant 2014-04-11 10:35:15 -04:00
Michael Santos
e7f6b38b9f Zero the child array twice
Scrub the fd array inherited from the parent before realloc(), in case
the number of fd's in the child has shrunk. Purely cosmetic, nothing
sensitive in this array.
2014-04-11 10:25:13 -04:00
Michael Santos
d05168fd6f Ensure the control fd is fd 3
dup the control fd to the "well known" file descriptor 3. This
simplifies checks in the code (reserved fd's are under 4).

Since dup'ing the fd doesn't copy the close on exec flag to the new fd,
perform the fcntl() operation after the fd is dup'ed.
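
As an illustration of the ordering, here is a sketch that moves a
descriptor to a fixed number and only then sets the close-on-exec flag.
The function name fd_move_cloexec is hypothetical, and taking the target
as a parameter (3 in the case described above) is an assumption made to
keep the example general:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: duplicate fd onto target, close the original,
 * then set FD_CLOEXEC on target. dup2(2) clears FD_CLOEXEC on the new
 * descriptor, so the flag has to be set after the dup.
 * Returns 0 on success, -1 on error. */
int fd_move_cloexec(int fd, int target)
{
    int flags;

    if (fd != target) {
        if (dup2(fd, target) < 0)
            return -1;
        (void)close(fd);
    }

    flags = fcntl(target, F_GETFD);
    if (flags < 0)
        return -1;

    if (fcntl(target, F_SETFD, flags | FD_CLOEXEC) < 0)
        return -1;

    return 0;
}
```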
2014-04-07 12:58:29 -04:00
Michael Santos
4b14631854 Add readdir/2
readdir/2 lists files within a directory. This duplicates some of the
functionality of the file and filelib modules in the stdlib but, since
the process may be running as a different user than beam and may be
running in a different mount space or chroot, the stdlib may not have
access to the requested files.

Since the main purpose of this function is to support enumerating
cgroups, opendir(3) and closedir(3) aren't accessible from erlang
(similar to setns, but differing from support for open, close, read,
write).
2014-04-05 14:08:23 -04:00
Michael Santos
891fa32170 Make SIGCHLD notifications optional 2014-04-05 09:43:51 -04:00
Michael Santos
32504ecece cli: mirror setopt/3
Toggle the exit_status and termsig switches on the command line using
an argument, similar to setopt/3.
2014-04-04 10:26:06 -04:00
Michael Santos
19f708cb45 prctl: remove the man page excerpt 2014-04-04 09:56:34 -04:00
Michael Santos
f010b148b2 Allow setting the number of child processes
The maximum number of forks a process can have is limited by the
available file descriptors, with the upper boundary set by the number of
fd's select() can handle.

This option is provided because limiting the number of child processes
per fork can't easily be done by other means. For example, setrlimit()
with RLIMIT_NPROC applies to all the user's processes and RLIMIT_NOFILE
will interfere with normal file operations. Useful for testing and for
preventing accidental fork bombs.

The maxchild option applies to child processes. The value in the
current process is unchanged.
2014-04-03 10:20:23 -04:00
Michael Santos
4aec915e20 Do not free the results of erl_iolist_to_binary()
http://erlang.org/doc/man/erl_eterm.html#erl_iolist_to_binary

The binary does not need to be freed.
2014-04-02 10:12:20 -04:00
Michael Santos
dd574a062d cleanup: prevent leaks on error
Free memory on error. Be cautious about errno being overwritten by
intervening functions.

Lots more cleanup to be done, for example, check for double free's.
2014-04-01 08:52:15 -04:00
Michael Santos
0a48267cb6 Account for the length of the message header
Since there may be multiple message headers, a read may result in a
valid packet that the next process in the fork chain rejects as too
large.

Calculate the message buffer size for the read based on the length of
the fork chain.
2014-03-31 14:36:48 -04:00
Michael Santos
e9903604f8 write: return the number of bytes written
write(2) may return success after writing less than the full buffer
supplied in the argument. From the Linux write(2) man page:

    The  number  of bytes written may be less than count if, for example,
    there is insufficient space on the underlying physical medium, or the
    RLIMIT_FSIZE resource limit is encountered (see setrlimit(2)), or the
    call was interrupted by a signal handler after  having  written  less
    than count bytes.  (See also pipe(7).)
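
Given the returned count, a caller can detect a short write and retry
with the remainder. A sketch of one way to do that in C follows. The
helper write_all is hypothetical and is not part of this change, which
only reports the count:

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write(2) until count bytes have
 * been written or an error other than EINTR occurs.
 * Returns count on success, -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    size_t left = count;

    while (left > 0) {
        ssize_t n = write(fd, p, left);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before writing: retry */
            return -1;
        }

        /* Short write: advance past the bytes already written. */
        p += n;
        left -= (size_t)n;
    }

    return (ssize_t)count;
}
```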
2014-03-31 11:30:12 -04:00
Michael Santos
7fb49fe91a cgroups: add open(2), write(2), read(2), close(2)
Support syscalls required for manipulating cgroups. See cpuset(7).
2014-03-31 07:59:51 -04:00
Michael Santos
6b7b24f331 cgroups: add mkdir(2), rmdir(2)
Begin support of cgroups by adding mkdir and rmdir. Instead of providing
an inflexible interface to cgroups within the port, support the
primitives so that the logic can be done within erlang.

This functionality duplicates the file module in the standard library.
It is unfortunately necessary because the port requires superuser
privileges to write to the cgroup filesystem.
2014-03-31 07:59:51 -04:00
Michael Santos
a27694a000 Add environ(7), clearenv(3) 2014-03-30 09:43:00 -04:00
Michael Santos
3fb6bf4317 Add functions for manipulating the environment
The process environment can be set up in a few ways:

* using the {env, [{Key, Val}]} option in alcove_drv:start/1

  Global: will affect all future spawned processes.

* at exec, using execve/5

  Per process: will affect the child process

* using getenv(3), setenv(3) and unsetenv(3)
2014-03-29 18:29:13 -04:00
Michael Santos
6e85e6391d Add optional notification of child exit status
Exit status is disabled by default and can be enabled per process by
using setopt.

    Port = alcove_drv:start([exit_status, termsig]),
    {ok, Child1} = alcove:fork(Port),
    {ok, Child2} = alcove:fork(Port),

    ok = alcove:exit(Port, [Child1], 1),
    {exit_status, 1} = alcove:event(Port, [Child1]),

    ok = alcove:kill(Port, [Child2], 9),
    {termsig, 9} = alcove:event(Port, [Child2]).

The exit status is spoofed so the event will appear to come from the
child. The parent will still receive SIGCHLD ({signal,17} on Linux).
2014-03-29 16:07:13 -04:00
Michael Santos
ef8f2f62b7 tests: add events
Add exit(3) for testing.

Confirm events are delivered properly, for example, to the parent after
the child has exited.

The event extraction in alcove_drv is still messy, redundant and error
prone. Needs to be cleaned up.
2014-03-29 15:30:45 -04:00
Michael Santos
8600ae82f0 Always treat stderr as a stream
Using exact reads for processes running the event loop broke stderr.
2014-03-29 13:13:11 -04:00
Michael Santos
f1ff3469a1 Rename alcove_ctl -> alcove_event_loop 2014-03-29 12:13:20 -04:00
Michael Santos
43369a15cd signal_constant: convert integer to atom 2014-03-29 11:53:58 -04:00
Michael Santos
e641b00891 setopt: remove setting maxchild per process
Accidentally the maxchild option. maxchild is used to size an array of
file descriptors. So obviously resetting this value will either crash
the process (read outside the bounds of the array) or cause it to leak
fd's.

So punt for now and remove support for changing maxchild. I guess the
simplest way to handle this would be to:

* get the new value of maxchild
* allocate an array of maxchild bytes
* copy PIDs from old to new
* if there is not enough space, free new array and return badarg
* otherwise, point the state to the new array and free the old array

Instead of crashing the caller by returning badarg, we could synthesize
an errno value and return {error,enomem} or simply leave the maxchild
unchanged and return ok.
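
The steps listed above could be sketched as follows. Everything here is
an assumption for illustration: struct child_state, maxchild_resize, and
the convention that a pid of 0 marks an unused slot are hypothetical, and
the array is sized in pid_t elements:

```c
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical state: a table of child PIDs, 0 marking a free slot. */
struct child_state {
    pid_t *pids;
    size_t maxchild;
};

/* Resize the PID table to maxchild slots. If the live PIDs do not fit,
 * the new array is freed and the state is left unchanged.
 * Returns 0 on success, -1 on failure. */
int maxchild_resize(struct child_state *st, size_t maxchild)
{
    pid_t *pids;
    size_t i, n = 0;

    if (maxchild == 0)
        return -1;

    pids = calloc(maxchild, sizeof(*pids));
    if (pids == NULL)
        return -1;

    /* Copy live PIDs from the old array to the new one. */
    for (i = 0; i < st->maxchild; i++) {
        if (st->pids[i] == 0)
            continue;

        if (n == maxchild) {
            free(pids);     /* not enough space: give up */
            return -1;
        }

        pids[n++] = st->pids[i];
    }

    /* Point the state at the new array and release the old one. */
    free(st->pids);
    st->pids = pids;
    st->maxchild = maxchild;
    return 0;
}
```

How the -1 is surfaced to the caller (badarg, {error,enomem}, or a
silent ok) is left open, as in the message above.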
2014-03-29 11:30:15 -04:00
Michael Santos
083256205a getopt: break after parsing opt 2014-03-29 10:53:54 -04:00
Michael Santos
bb9b98011c Add getopt/1: retrieve per process port options 2014-03-28 16:54:01 -04:00
Michael Santos
86ceca06b1 headers: mirror numbering of Unix stdio
Change the order of the message header types so that STDIN is 0, STDOUT
is 1 and STDERR is 2.
2014-03-27 14:50:51 -04:00
Michael Santos
cd551f4521 Retrieve long values from the external term format
Getting an integer value from erl_interface is tricky: the macros are
undocumented and the behaviour is not what you'd expect. A small value is
sent from the Erlang side as an integer. An unsigned integer is a value
exceeding a signed int. A long long is a value exceeding an unsigned
int, etc.

Test that the value is an integer and is one of the expected types. In
the case of setrlimit, if the executable is compiled with support for
large files, the value of cur/max are stored in 8 bytes.

If a long long is passed into cur/max using 32-bits, it should be
truncated.
2014-03-27 14:18:36 -04:00
Michael Santos
27d8880926 clone: correct signedness of flags 2014-03-27 14:00:37 -04:00
Michael Santos
c9636a8e5c fork: use an unsigned integer to track depth
The intent of using a signed integer was to allow disabling the fork
depth limit, but it's simpler to use an unsigned integer and set a high
value or set it to -1 and let the value wrap.
2014-03-27 13:40:20 -04:00
Michael Santos
8eaa8975b0 maxforkdepth: control the length of fork chains 2014-03-27 09:13:30 -04:00
Michael Santos
b83b905b6c mount: fix pasto in define 2014-03-26 15:34:45 -04:00
Michael Santos
0dd6ca4dcd mount: add portable mount flags 2014-03-26 15:32:17 -04:00
Michael Santos
90e0f7679d prctl: set length of binary in arg 2014-03-26 14:23:05 -04:00
Michael Santos
04b5277555 Define constants as atoms
Instead of using a case statement, simplify the lookup by searching
through an array.

Do not modify the name of constants: previously, the names were
downcased and the leading namespace removed, e.g., PR_CAPBSET_READ
would become 'get_capbset_read' and SIG_TERM would be represented by
'term'.  Constants are now atoms mirroring the name in the header file:
'PR_CAPBSET_READ', 'SIG_TERM'.

This change will cause issues with non-portable naming of constants. For
example, Linux uses MS_ to preface mount constants and FreeBSD uses
MNT_. A future change will add portable representations of the atom.
Something like 'rdonly' for MS_RDONLY and MNT_RDONLY.
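
A sketch of an array-based lookup of this kind, where the name stored in
the table is the constant's name exactly as it appears in the header.
The struct, the CONSTANT macro and constant_lookup are hypothetical
names, and the table uses a few signals from signal.h purely as sample
data:

```c
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: header name and numeric value. */
struct constant {
    const char *name;
    long long value;
};

/* Stringify the identifier so the name always mirrors the header. */
#define CONSTANT(x) {#x, x}

static const struct constant signal_constants[] = {
    CONSTANT(SIGHUP),
    CONSTANT(SIGINT),
    CONSTANT(SIGKILL),
    CONSTANT(SIGTERM),
    {NULL, 0}               /* sentinel */
};

/* Linear search by name. Returns 0 and stores the value on a match,
 * -1 if the name is unknown. */
int constant_lookup(const struct constant *table, const char *name,
        long long *value)
{
    const struct constant *c;

    for (c = table; c->name != NULL; c++) {
        if (strcmp(c->name, name) == 0) {
            *value = c->value;
            return 0;
        }
    }

    return -1;
}
```

Adding a constant is then a one-line change to the table rather than a
new branch in a case statement.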
2014-03-26 14:13:12 -04:00
Michael Santos
0727a2d064 getopt: limit maximum number of child procs 2014-03-26 09:59:15 -04:00
Michael Santos
c6191a0334 Allow setting max number of children per fork 2014-03-25 08:27:22 -04:00
Michael Santos
3f2de7e295 setopt/3: set options per process 2014-03-25 08:08:30 -04:00
Michael Santos
0fbd6ce289 exec: free env on badarg 2014-03-24 09:32:53 -04:00
Michael Santos
f3df938969 Set the exec() status of the child
Set the exec() status of the child before reading from the child's
stdout/stderr: the exec status determines the message type (proxy or
stdout) returned by the port.

Begin recording the exit status of the child. Currently it is just a
non-zero number. It should be set to the actual exit status of the
child. The value should be reported to the erlang side as well:

    {exit_status, integer()}
2014-03-23 16:06:32 -04:00
Michael Santos
36e890b148 Merge branch 'execve' 2014-03-23 15:11:38 -04:00