This is a package with a single function that uses the server used by the ‘rmote’ package to send data objects over the network to be viewed using the RStudio viewer in a local session.
In order to use this package you must first install two packages,
servr
and rmote
, into your remote R library (i.e., the one on the
compute cluster). servr
is on CRAN and can be installed with: be
installed with:
install.packages("servr")
rmote
is not on CRAN (and may not ever be), so you must install it
with:
remotes::install_github("cloudyr/rmote")
There might be a few dependencies that are not automatically installed, and so it might take several round of installation to finally get those two packages installed. Once they are installed, however, you can then install this ‘rView’ package on both your local and remote computers, by doing this on both of them:
remotes::install_github("eriqande/rView")
Here is what a session looks like on a compute cluster. Some of this is a workaround because the compute nodes may not be free to do port forwarding, but the head node(s) typically can.
From RStudio terminal, login to the head node on the cluster
From RStudio terminal get a shell on compute node (srun --pty
/bin/bash
of sinteractive
, etc., depending on your cluster), and
_start an R session by typing R
at the command line of the
compute node.
In the R session on the remote machine, start the ‘rmote’ server,
but make sure that it is writing to a directory in your home
directory. (This is necessary so we can relay the contents of that
directory from a port on the head node. The defaut ‘rmote’ directory
location in /tmp
is accessible only by the compute node, not the
head node.) This directory can be on any storage that is shared by
the different nodes in the cluster. (So, for example, you could put
it in your scratch
area, if desired. For example, for me, I might
use, the directory /home/eanderson/rmote_server
. Also you need to
use a port number. The default port number is 4321, but you should
choose a different one to make sure you don’t collide with another
user. Here I will use 4111, but you should use a different number,
let us say in the range 4000 to 5000.
sh
library(rmote)
start_rmote(server_dir = "/home/eanderson/rmote_server", port = 4111)
Once you have started the ‘rmote’ web server from your remote R session, you can then load the ‘rView’ package with:
r
library(rView)
Now, back on your local machine, in a separate terminal (it does not need to be in RStudio…you could use the Terminal app, or iTerm on a Mac), login to the head node, while enabling port forwarding from your port number (4111 in the example) on the head node to the same port number on your local computer. That looks like this
``` sh ssh -L 4111:localhost:4111 eanderson@sedna.nwfsc2.noaa.gov
rmote::start_rmote(server_dir = "/home/eanderson/rmote_server", port = 4111) ```
This starts R on the head node, and within that R session a small
web server is run that serves up data found in the server_dir
on
port 4111. This is not a compute-intensive job, so the sysadmins
should not be upset that this is running on the head node.
Also, back on your local machine, open a browser window (like Chrome
or Safari) and go to the address of the port number on your local
machine. In our example, the port is 4111 so you must go to:
http://localhost:4111/
. This is where rmote
will serve up plots
and help screens.
This is very straightforward. The ‘rmote’ package, when invoked on the
remote machine, overwrites the standard print
function for base plots,
ggplots, and the help function. So, if you print a ggplot2 object,
rather than trying to print on the remote machine and finding nowhere to
print it, when you are running ‘rmote’, the object will be printed to
the server_dir
location and served to your local web browser via ssh
port forwarding. You can try it like this:
First, do:
?rmote::start_rmote
on the remote machine; then check you web browser that is pointed to
http://localhost:4111/
.
Second, you might try printing a ggplot object in the R session on the remote machine:
library(tidyverse)
tibble(x = 1:10, y = (1:10)^2) %>%
ggplot(aes(x = x, y = y)) +
geom_point(colour = "blue", size = 2) +
geom_line()
That will cause the plot to be printed to the web page on the browser.
When all is set up as above, we can send objects to our local RStudio data viewer like this:
# on remote R session
rView(mtcars)
# then, on local session
rView(mtcars)
# on remote R session
library(tidyverse)
diamonds %>%
filter(cut == "Ideal") %>%
rView()
# then, on local session get it by:
rView(.)
# or by piping anything into rView()...
rview_pass_through
optionThere might be times when you would like to add rView()
commands to
the actual R code blocks in an RMarkdown document. For example, if you
are working on an RMarkdown document, then you might have an R code
black that creates a new data object (a tibble maybe) called
my_result
. To print that to the RMarkdown document, you might end the
code block with a line that says:
my_result
Doing so would cause the my_result
tibble to be printed in the
RMarkdown document.
If, instead, you wrote in that code block:
rView(my_result)
then, it would be quite easy, when working through the code blocks on a
remote machine (and the local machine) to send that code to the remote
and the local R interpreters in order to view the remote data object on
the local RStudio data object browser. However, what if you actually
just want to knit the RMarkdown document? That is where the
rview_pass_through
option comes into play.
If you set that option to TRUE, with:
options(rview_pass_through = TRUE)
Then the rView
commands will be ignored, and effectively it will just
default-print the object in the console/RMarkdown document.
You can restore the regular behavior of rView
with:
options(rview_pass_through = NULL)
To keep from having to type the start_rmote
command multiple times,
you can put that command (with your preferred port) in your .Rprofile
.
Then you would always be in rmote
mode when you start R on your remote
machine. But then that could screw you up if you are running R in batch
mode. So it is probably better to just define a function in your
.Rpofile
that is easy to type, like:
rmote <- function(server_dir = "~/rmote_server", port = 4949) {
rmote::start_rmote(server_dir = server_dir, port = port)
}
that includes the server_dir and the port that you will always want to
use. Then, when you enter R, you just have to type rmote()
, but if you
wanted to change the server_dir and the port, you still could.
At some point you might have a lot of old thumbnails and things in the
directory you are using as your server_dir
. No problem, just navigate
inside that directory on the Unix command line on the remote machine and
do:
rm -r .idx *
Be careful with that command: if you give it in the wrong directory, it
will delete all your files. The .idx
is there to explicitly remove the
hidden file .idx
, so that rmote will start renumbering files starting
from 1 again.
This can be done at any time, even if rmote is running.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.