h2odyssey

An R package to deploy h2o on the Harvard Odyssey cluster

Installation of R packages on Odyssey

Create the folder apps/R in your home folder

mkdir apps
mkdir apps/R

Modify your .bashrc file as follows:

# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi

# User specific aliases and functions
source new-modules.sh

module load R

# replace username with your username.  You may have to change home03 too.
export R_LIBS_USER=/n/home03/username/apps/R:$R_LIBS_USER

Installation of h2odyssey

In an R session:

devtools::install_github("NSAPH/h2odyssey")

Running h2odyssey on Harvard Odyssey

You first need to start a SLURM job.

# Example: we request a total of 2GB on 2 nodes with 20 cores per node in the shared partition
# srun -p shared  --mem 2g -t 0-06:00 -c 20 -N 2 --pty /bin/bash

# Example: we request a total of 300GB on 2 nodes with 32 cores per node in the bigmem partition
# srun -p bigmem --pty --mem 300g -t 0-06:00  -c 32 -N 2 /bin/bash

TODO: screen

To use screen:

# srun -p shared --pty --mem 2g -t 0-06:00 -c 20 -N 2 R

From a compute node:

library(h2odyssey)
# memory per node in GB
# start and connect to an h2o cluster with 2GB of RAM per node
start_h2o_cluster(memory = 2) 
# Code here...
h2o.shutdown()

You can try h2o examples:

https://github.com/h2oai/h2o-tutorials/blob/master/h2o-open-tour-2016/chicago/intro-to-h2o.R

https://github.com/h2oai/h2o-tutorials/blob/master/tutorials/ensembles-stacking/stacked_ensemble_h2o_xgboost.Rmd



NSAPH/h2odyssey documentation built on May 20, 2019, 9:59 a.m.