Azure Runner R package : scientist-friendly functions for creating Azure resources for running R code
July 2020
This package provides convenience methods around standard cloud practices for running R code (specifically Bayesian models) on the Azure cloud. It is a simple package, with some pre-configured conventions, that lets scientists provision and work on cloud computers using R functions from within RStudio.
This package is in the very early stages of development: it does not have much functionality yet, and some R scripts/functions are incomplete ideas or stubs. It remains to be seen whether it will be easier to use than calling the functions from the AzureR library collection directly.
This section is for development only and will need to be rewritten as the package is developed.
Requirements
Getting Started
Create a file named .Renviron in the root folder (note the starting character is a dot/period). Copy example-Renviron to .Renviron to start:
AZUREUSER=<your azure id>
AZURESUB=<azure subscription id>
AZURERG=<resource group this will primarily be used with>
Note: The .Renviron file is read when you start R and creates environment variables you can access from your R session. See https://rstats.wtf/r-startup.html for a good description.
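As a quick check that the file was picked up, the variables can be read back with Sys.getenv() after restarting R. The helper below is a minimal sketch in base R; check_azure_env is a hypothetical name and is not part of this package.

```r
# Hypothetical helper (not part of azrunr): report which of the
# expected .Renviron variables are unset or empty in this session.
check_azure_env <- function(vars = c("AZUREUSER", "AZURESUB", "AZURERG")) {
  values <- Sys.getenv(vars, unset = NA)  # NA marks a variable that is not set
  vars[is.na(values) | values == ""]
}

missing_vars <- check_azure_env()
if (length(missing_vars) > 0) {
  message("Not set: ", paste(missing_vars, collapse = ", "))
}
```

If anything is reported as not set, check that .Renviron sits in the project root and restart the R session.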
Install the renv package with install.packages("renv") and then use renv to install all the packages necessary for this project with renv::init().
After setting the Azure subscription (AZURESUB) and resource group (AZURERG), either in .Renviron or in your OS environment, try the following in the R console:
devtools::build() # build the package
library(azrunr) # load the package
set_azure_options() # set R options for Azure subscription. Triggers an Azure log-in if necessary
allResourceNames() # should list all the items in resource group defined above
names(storageAccounts()) # should list storage accounts if any
Additionally, developers or users of this package with access to multiple resource groups can select a different resource group to work within, overriding what is in .Renviron:
azuresub <- get_sub() # gets a subscription object
rgroups <- azuresub$list_resource_groups() # pull all resource groups in the subscription
names(rgroups) # list all the resource group names
first_group <- names(rgroups)[1] # get the first group in the list
first_group # and show it
set_azure_options(azurerg=first_group) # change default resource group
current_rg <- get_rg() # get a resource group object
current_rg$name # should be the same
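The override above follows a common pattern: an explicit argument wins, otherwise the value from the environment is used, and the result is stored as an R option. The sketch below illustrates that pattern in base R only. set_options_sketch and the sketch.* option names are hypothetical, and set_azure_options() is not necessarily implemented this way.

```r
# Hypothetical illustration of "argument overrides environment variable".
# Defaults are evaluated at call time, so they reflect the current environment.
set_options_sketch <- function(azuresub = Sys.getenv("AZURESUB"),
                               azurerg  = Sys.getenv("AZURERG")) {
  options(sketch.azuresub = azuresub, sketch.azurerg = azurerg)
  invisible(list(azuresub = azuresub, azurerg = azurerg))
}

set_options_sketch()                        # uses AZURESUB / AZURERG from the environment
set_options_sketch(azurerg = "another-rg")  # explicit argument overrides the environment
getOption("sketch.azurerg")                 # "another-rg"
```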
There is another set of R options for working with storage.
These parameters can be set in the .Renviron file:
AZURESTOR=<your azure storage account name>
AZURECONTAINER=<your azure container name>
STORAGEACCESSKEY=<your azure storage access key>
You will then want to run the following to set the R options:
set_storage_options() # set R options for Azure storage
The R options can alternatively be set with the following:
set_storage_options(azurestor, azurecontainer, storageaccesskey) # set R options for Azure storage
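For orientation, the storage account name determines the blob endpoint URL that Azure client libraries connect to. The sketch below builds that URL in base R. blob_endpoint_url is a hypothetical helper, not part of this package; it assumes the public Azure cloud and applies Azure's rule that storage account names are 3 to 24 lowercase letters and digits.

```r
# Hypothetical helper (not part of azrunr): derive the blob endpoint URL
# from a storage account name, assuming the public Azure cloud.
blob_endpoint_url <- function(account = Sys.getenv("AZURESTOR")) {
  # Storage account names: 3-24 characters, lowercase letters and digits only
  if (!grepl("^[a-z0-9]{3,24}$", account)) {
    stop("Invalid storage account name: '", account, "'")
  }
  sprintf("https://%s.blob.core.windows.net", account)
}

blob_endpoint_url("mystorageacct")  # "mystorageacct" is a placeholder name
```

The resulting URL, together with the access key and container name, is the kind of information a storage client such as the AzureStor package needs to open a container.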
A VM can be launched in different ways depending on the resources available and the intended deployment.
Launch VM with provided shell script extension
The following can be used to create a VM and then run the shell script on it.
set_azure_options() # set the R options for subscription and resource group
set_storage_options() # set the R options for storage account, container and storage access key
vm_from_template(vmName, templateFile, shellScript, adminPasswordOrKey, userPassword, cpuSize, ubuntuOSVersion)
Parameters:
- vmName: the name of the VM, also used as a prefix on all other related resources created during deployment
- templateFile: the file path to the template JSON used to deploy the VM and other resources. There is a template provided at inst/VM_From_Template/azuredeploy.json
- shellScript: the file path to the extension script file. There is a file provided at inst/VM_From_Template/installrstudio_ubuntu20.sh
- adminPasswordOrKey: SSH public key used to access the VM through SSH
- userPassword: RStudio password
- cpuSize: the size of the CPU, one of ("CPU-4GB", "CPU-7GB", "CPU-8GB", "CPU-14GB", "CPU-16GB", "GPU-56GB")
- ubuntuOSVersion: the Ubuntu version of the VM, one of ("18.04-LTS", "20_04-lts", "20_04-daily-lts-gen2")
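Because cpuSize and ubuntuOSVersion only accept the listed values, it can help to validate them before starting a deployment that takes several minutes. The sketch below uses base R's match.arg() for that. validate_vm_choices is a hypothetical helper, and whether vm_from_template() performs a similar check internally is not stated here.

```r
# Hypothetical pre-flight check (not part of azrunr) for the two
# enumerated parameters of vm_from_template().
validate_vm_choices <- function(
    cpuSize = c("CPU-4GB", "CPU-7GB", "CPU-8GB", "CPU-14GB", "CPU-16GB", "GPU-56GB"),
    ubuntuOSVersion = c("18.04-LTS", "20_04-lts", "20_04-daily-lts-gen2")) {
  # match.arg() errors on values outside the list
  # (it also accepts unambiguous prefixes of a listed value)
  cpuSize <- match.arg(cpuSize)
  ubuntuOSVersion <- match.arg(ubuntuOSVersion)
  list(cpuSize = cpuSize, ubuntuOSVersion = ubuntuOSVersion)
}

choices <- validate_vm_choices("CPU-8GB", "20_04-lts")
```

Calling the helper with a value outside the lists, for example "CPU-1TB", stops with an error before any Azure resources are touched.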
Accessing the launched VM
The deployment can take a while (around 10 minutes) to complete, depending on the content of the extension.
The deployment status can be found in the Azure Portal under the resource group -> Deployments. It will be listed with the name provided for the vmName parameter.
The VM can then be accessed through RStudio Server by entering the IP address shown in the portal, followed by ":8787", in the browser, in this format: "xx.xxx.xxx.xx:8787".
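The address can also be assembled from R once the IP is known. This is a minimal sketch: rstudio_url is a hypothetical helper, the IP shown is a placeholder from the reserved documentation range, and plain http is an assumption about how the server is exposed.

```r
# Hypothetical helper (not part of azrunr): build the RStudio Server
# address from the VM's public IP. Assumes plain http on port 8787.
rstudio_url <- function(ip, port = 8787) {
  sprintf("http://%s:%d", ip, as.integer(port))
}

url <- rstudio_url("203.0.113.10")  # placeholder IP; use the one from the portal
# utils::browseURL(url)             # uncomment to open it in the default browser
```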
You can then log in to RStudio Server with the userName and userPassword that were used to launch the VM.
NOTE: userName defaults to the azureuser option set in .Renviron or with set_azure_options().
This could become more flexible to use any template for deployment in the future.