library(subprocess)
library(knitr)
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
Since R is not really a systems-programming language[^systemslanguage], some facilities present in other languages (e.g. C/C++, Python) have not yet been brought to R. One such feature is process management, understood here as the capability to create, interact with and control the lifetime of child processes.
The R package subprocess aims to fill this gap by providing a few basic operations to carry out these tasks. The spawn_process() function starts a new child process and returns a handle which can later be used with process_read() and process_write() to receive and send data, or with process_wait() and process_terminate() to wait for or stop such a process.
The R subprocess package has been modeled after the Python package of the same name. Its documentation is available online, and numerous examples of how it can be used can be found on the Web.
The R subprocess package has been verified to run on Linux, Windows and macOS.
[^systemslanguage]: "By systems programming I mean writing code that directly uses hardware resources, has serious resource constraints, or closely interacts with code that does." Bjarne Stroustrup, "The C++ Programming Language"
The main concept in the package is the handle. It holds process identifiers and an external pointer object, which in turn refers to a low-level data structure holding various system-level parameters of the running sub-process.
A child process, once started, runs until it exits on its own or until it is killed. Its current state as well as its exit status can be obtained through a dedicated API.
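As a minimal sketch of that API, the snippet below starts a child R process that exits immediately with status 3 and then queries its state and exit status. The exit status and the 10-second timeout are arbitrary choices made for illustration, and the demo is skipped when subprocess is not installed.

```r
# Path to the R executable of the current installation.
r_exe <- file.path(R.home("bin"),
                   if (.Platform$OS.type == "windows") "R.exe" else "R")

if (requireNamespace("subprocess", quietly = TRUE)) {
  # The child evaluates a single expression and exits with status 3.
  handle <- subprocess::spawn_process(
    r_exe, c("--slave", "-e", "quit(status=3)"))

  # Wait up to 10 seconds (10000 ms) for the child to exit on its own.
  subprocess::process_wait(handle, 10000)

  # Query the state and the exit status through the handle.
  print(subprocess::process_state(handle))
  print(subprocess::process_return_code(handle))
}
```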
Communication with the child process can be carried out over the three standard streams: the standard input, standard output and standard error output. These streams are intercepted on the child process' side and redirected into three anonymous pipes whose other ends are held by the parent process and can be accessed through the process handle.
In Linux these are regular pipes created with the pipe() system call and opened in non-blocking mode. All communication takes place on request and follows the usual OS rules (e.g. the sub-process will sleep if its output buffer gets filled).
In Windows these pipes are created with the CreatePipe() call and opened in blocking mode. Windows does not support non-blocking (overlapped in Windows-speak) mode for anonymous pipes. For that reason each stream has an accompanying reader thread. Reader threads are separated from the R interpreter, do not exchange memory with it and will not break the single-thread assumption under which R operates.
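Because reads happen only on request, a single process_read() call may return before the child has produced all of its output. The sketch below polls one pipe until a read comes back empty. The helper read_until_quiet() and its parameters quiet_ms and max_reads are hypothetical names introduced here, not part of subprocess, and the helper assumes that a read from a single pipe yields a character vector that is empty when nothing is available.

```r
# Hypothetical helper (not part of subprocess): keep reading from one pipe
# until a read returns nothing within `quiet_ms` milliseconds, or until
# `max_reads` reads have been made.
read_until_quiet <- function(handle, pipe, quiet_ms = 500, max_reads = 100) {
  lines <- character(0)
  for (i in seq_len(max_reads)) {
    chunk <- subprocess::process_read(handle, pipe, timeout = quiet_ms)
    if (length(chunk) == 0) break   # nothing arrived: assume the child is quiet
    lines <- c(lines, chunk)
  }
  lines
}

if (requireNamespace("subprocess", quietly = TRUE)) {
  r_exe <- file.path(R.home("bin"),
                     if (.Platform$OS.type == "windows") "R.exe" else "R")
  handle <- subprocess::spawn_process(r_exe, c("--slave", "-e", "cat('hello')"))

  # A generous timeout leaves room for the child's start-up.
  out <- read_until_quiet(handle, subprocess::PIPE_STDOUT, quiet_ms = 5000)
  print(out)

  subprocess::process_wait(handle, 5000)
}
```

Polling this way trades a fixed Sys.sleep() for a bounded wait, at the cost of one extra read that times out at the end.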
Before we move on to examples, let's define a few helper functions that abstract out the underlying operating system. We will use them throughout this vignette.
is_windows <- function () (tolower(.Platform$OS.type) == "windows")

R_binary <- function () {
  R_exe <- ifelse (is_windows(), "R.exe", "R")
  return(file.path(R.home("bin"), R_exe))
}
Just for the record, the following expression reports the platform on which this vignette has been built:
ifelse(is_windows(), "Windows", "Linux")
Now we can load the package and move on to the next section.
library(subprocess)
In this example we spawn a new R process, send a few commands to its standard input and read the responses from its standard output. First, let's spawn the child process (and give it a moment to complete the start-up sequence[^syssleep]):
handle <- spawn_process(R_binary(), c('--no-save'))
Sys.sleep(1)
[^syssleep]: Depending on the system load, R can take a few seconds to start and be ready for input. This is also true for other processes. Thus, you will see Sys.sleep() following spawn_process() in almost every example in this vignette.
Let's see the description of the child process:
print(handle)
And now let's see what we can find in the child's output:
process_read(handle, PIPE_STDOUT, timeout = 1000)
process_read(handle, PIPE_STDERR)
The final character(0) is the output read from the standard error stream. In the examples that follow, the first number in the output is the value returned by process_write(), which is the number of characters written to the standard input of the child process.
Next, we create a new variable in the child's session. Note the new-line character at the end of the command: it triggers the child process to process its input.
process_write(handle, 'n <- 10\n')
process_read(handle, PIPE_STDOUT, timeout = 1000)
process_read(handle, PIPE_STDERR)
Now it's time to use this variable in a function call:
process_write(handle, 'rnorm(n)\n')
process_read(handle, PIPE_STDOUT, timeout = 1000)
process_read(handle, PIPE_STDERR)
Finally, we exit the child process:
process_write(handle, 'q(save = "no")\n')
process_read(handle, PIPE_STDOUT, timeout = 1000)
process_read(handle, PIPE_STDERR)
The last thing is making sure that the child process is no longer alive:
process_state(handle)
process_return_code(handle)
Of course there is little value in running a child R process since there are multiple other tools that let you do that, like parallel, Rserve and opencpu, to name just a few. However, it's quite easy to imagine how running a remote shell in this manner enables new ways of interacting with the environment. Consider running a local shell:
shell_binary <- function () {
  ifelse (tolower(.Platform$OS.type) == "windows",
          "C:/Windows/System32/cmd.exe", "/bin/sh")
}

handle <- spawn_process(shell_binary())
print(handle)
Now we can interact with the shell sub-process. We send a request to list the current directory, then give it a moment to process the command and produce the output (and maybe finish its start-up, too). Finally, we check its output streams.
process_write(handle, "ls\n")
Sys.sleep(1)
process_read(handle, PIPE_STDOUT)
process_read(handle, PIPE_STDERR)
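The ls command is understood by /bin/sh but not by cmd.exe, so on Windows the request above would most likely produce an error message rather than a listing. A minimal sketch of a platform-aware variant follows. The helper list_command() is a hypothetical name introduced here, and the sketch assumes that cmd.exe accepts a command terminated by a plain new-line character.

```r
is_windows <- function() tolower(.Platform$OS.type) == "windows"

# Hypothetical helper (not part of subprocess): choose a directory-listing
# command that matches the shell started by shell_binary(), that is
# cmd.exe on Windows and /bin/sh elsewhere.
list_command <- function() {
  if (is_windows()) "dir\n" else "ls\n"
}

# Show the command that would be sent; with a live shell handle it would be
# passed on as process_write(handle, list_command()).
cat(list_command())
```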
If the child process needs to be terminated one can choose to:

- send an exit command to its standard input with process_write()
- send the termination signal, SIGTERM (Linux, Windows)
- send the kill signal, SIGKILL (Linux only)[^termination]

Assume the child R process is hung and there is no way to stop it gracefully. process_wait(handle, 1000) waits for 1 second (1000 milliseconds) for the child process to exit. Since the child is still running, the call returns NA. Then process_terminate() gives the child R process a chance to exit gracefully. Finally, process_kill() forces it to exit.
sub_command <- "library(subprocess);subprocess:::signal(15,'ignore');Sys.sleep(1000)"
handle <- spawn_process(R_binary(), c('--slave', '-e', sub_command))
Sys.sleep(1)

# process is hung
process_wait(handle, 1000)
process_state(handle)

# ask nicely to exit; will be ignored in Linux but not in Windows
process_terminate(handle)
process_wait(handle, 1000)
process_state(handle)

# forced exit; in Windows the same as the previous call to process_terminate()
process_kill(handle)
process_wait(handle, 1000)
process_state(handle)
We see that the child process remains running until it receives the SIGKILL signal[^signal]. The final return code (exit status) is the number of the signal that caused the child process to exit[^status].
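The escalation shown above (ask nicely, wait, then force) can be wrapped in a small helper. This is only a sketch: stop_child() and its grace_ms parameter are hypothetical names introduced here, not part of subprocess, and the helper assumes that process_wait() with a zero timeout returns immediately, yielding NA while the child is still running.

```r
# Hypothetical helper (not part of subprocess): try a graceful stop first
# and fall back to a forced one if the child is still alive afterwards.
stop_child <- function(handle, grace_ms = 1000) {
  # Nothing to do if the child has already exited.
  status <- subprocess::process_wait(handle, 0)
  if (!is.na(status)) return(status)

  # Ask nicely and give the child `grace_ms` milliseconds to comply.
  subprocess::process_terminate(handle)
  status <- subprocess::process_wait(handle, grace_ms)

  # Still running (NA): force the exit and wait once more.
  if (is.na(status)) {
    subprocess::process_kill(handle)
    status <- subprocess::process_wait(handle, grace_ms)
  }
  status
}

if (requireNamespace("subprocess", quietly = TRUE)) {
  r_exe <- file.path(R.home("bin"),
                     if (.Platform$OS.type == "windows") "R.exe" else "R")
  handle <- subprocess::spawn_process(
    r_exe, c("--slave", "-e", "Sys.sleep(1000)"))
  final_status <- stop_child(handle, grace_ms = 2000)
  print(final_status)
}
```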
[^termination]: In Windows, process_terminate() is an alias for process_kill(). They both lead to immediate termination of the child process.
[^signal]: The subprocess:::signal() call in our example reaches a hidden C function that subprocess provides mainly for the purposes of this example.
[^status]: See the waitpid() manual page.
The last topic we want to cover here is sending an arbitrary[^windowssignals] signal to the child process. Signals can be listed by looking at the signals variable present in the package. It is constructed automatically when the package is loaded and its value on Linux differs from that on Windows. In the example below we see the first three elements of the Linux list of signals.
length(signals)
signals[1:3]
All possible signal identifiers are supported directly from the subprocess package. Signals not supported on the current platform are set to NA and the rest have their OS-specific numbers as their values.
ls(pattern = 'SIG', envir = asNamespace('subprocess'))
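To see which of these identifiers are usable on the current platform, the NA entries can be filtered out. The sketch below assumes that every object in the package namespace whose name starts with SIG is a single number or NA, as described above. The variable names are chosen here for illustration.

```r
if (requireNamespace("subprocess", quietly = TRUE)) {
  ns <- asNamespace("subprocess")

  # Names of all signal identifiers defined by the package.
  sig_names <- ls(pattern = "^SIG", envir = ns)

  # Fetch their values into one named vector.
  sig_values <- unlist(mget(sig_names, envir = ns))

  # Keep only the signals that have a number on this platform.
  supported <- sig_values[!is.na(sig_values)]
  print(supported)
}
```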
Now we can create a new child process and send an arbitrary signal using its handle.
handle <- spawn_process(R_binary(), '--slave')
process_send_signal(handle, SIGUSR1)
[^windowssignals]: The list of signals supported in Windows is much shorter than the list of signals supported in Linux and contains the following three signals: SIGTERM, CTRL_C_EVENT and CTRL_BREAK_EVENT.