Use Case

Data scientists often care not only about computation but also about the cost efficiency of running analytical jobs in the cloud. It is therefore useful to have a monitoring tool that reports the data consumption and total expense of using Azure DSVMs. This vignette shows how to achieve this with the dataConsumptionDSVM and costDSVM functions in AzureDSVM.

Setup

In this tutorial, we assume that at least one DSVM is deployed in a resource group and that it has been in use for a certain period of time.

Similar to the previous sections, credentials for authentication are required.

# Load the required subscription resources: TID, CID, and KEY.
# Also includes the ssh PUBKEY for the user.

USER <- Sys.info()[['user']]

source(paste0(USER, "_credentials.R"))
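
As a minimal sketch, one could check that the sourced script actually defined the expected variables before continuing. The helper name `missing_credentials` is hypothetical and is not part of AzureSMR or AzureDSVM; the variable names TID, CID, and KEY are the ones listed in the comment above, and the demo environment stands in for a real credentials script.

```r
# Hypothetical helper (not part of AzureSMR or AzureDSVM): return the names
# of expected credential variables that are not defined in an environment.
missing_credentials <- function(required = c("TID", "CID", "KEY"),
                                envir = globalenv()) {
  found <- vapply(required, exists, logical(1),
                  envir = envir, inherits = FALSE)
  unname(required[!found])
}

# Demonstration with a throwaway environment in which only TID is set.
demo_env <- new.env()
assign("TID", "hypothetical-tenant-id", envir = demo_env)

missing_in_demo <- missing_credentials(envir = demo_env)
print(missing_in_demo)
```

In a real session the helper would be called without arguments right after `source()`, and a non-empty result would indicate that the credentials script is incomplete.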

# Load the required packages.

library(AzureSMR)    # Support for managing Azure resources.
library(AzureDSVM)    # Further support for the Data Scientist.
library(magrittr)    
library(dplyr)

Data consumption

Resource consumption information is important to cloud users because it helps them plan their use of cloud resources wisely. The dataConsumptionDSVM function in AzureDSVM obtains the data consumption of a DSVM instance over a given period of time.

The following information is needed to obtain the data consumption of a DSVM: its name, the start and end of the time period, and the granularity of the report. Let's assume the DSVM is one that was deployed in the previous sections and has been running for a while.

# not-run

# VM     <- "dsvm_name"
# START  <- "starting_time_point" # in the format of YYYY-MM-DD HH:MM:SS.
# END    <- "ending_time_point"   # in the format of YYYY-MM-DD HH:MM:SS
# GRA    <- "Daily"
VM     <- "dlzd"
START  <- "2016-09-01 00:00:00" # in the format of YYYY-MM-DD HH:MM:SS
END    <- "2017-06-01 00:00:00" # in the format of YYYY-MM-DD HH:MM:SS
GRA    <- "Daily"
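
Because the start and end points are passed as plain strings, it can help to confirm that they parse in the stated format and are in the right order before querying Azure. The following is a minimal sketch in base R. The helper name `parse_time` is hypothetical, and the use of UTC is an assumption made here for illustration; how AzureDSVM itself interprets time zones is not stated in this vignette.

```r
# Hypothetical helper: parse a "YYYY-MM-DD HH:MM:SS" string and fail early
# if it does not match that format. UTC is an assumption of this sketch.
parse_time <- function(x) {
  t <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
  if (is.na(t)) stop("Expected YYYY-MM-DD HH:MM:SS, got: ", x)
  t
}

start_time <- parse_time("2016-09-01 00:00:00")
end_time   <- parse_time("2017-06-01 00:00:00")

# The period must have a positive length.
stopifnot(start_time < end_time)

# Length of the period in days, useful as a sanity check for "Daily" output.
n_days <- as.numeric(difftime(end_time, start_time, units = "days"))
print(n_days)
```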

Get the data consumption of the DSVM.

# Authenticate with the Azure account.

context <- createAzureContext(tenantID=TID, clientID=CID, authKey=KEY)

# Get the data consumption of the instance.

data_consum <- dataConsumptionDSVM(context,
                                   hostname=VM,
                                   time.start=START,
                                   time.end=END,
                                   granularity=GRA)

print(data_consum)

Data consumption is often used to calculate the expense incurred by running analytical tasks on the DSVM. The expense can be retrieved with the costDSVM function. The calculation multiplies the price rates of the DSVM components by the data consumption during the given time period.
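
The arithmetic behind this can be illustrated with a small, self-contained sketch. All data below is hypothetical: the component names, quantities, and rates are made up for illustration, and the column names are not claimed to match what dataConsumptionDSVM or costDSVM return.

```r
# Hypothetical metered usage per component (made-up numbers).
usage <- data.frame(
  component = c("compute", "compute", "storage"),
  quantity  = c(24, 12, 10),
  stringsAsFactors = FALSE
)

# Hypothetical price rate per unit for each component.
rates <- data.frame(
  component = c("compute", "storage"),
  rate      = c(0.5, 0.25),
  stringsAsFactors = FALSE
)

# Attach the matching rate to each usage record, then multiply.
billed <- merge(usage, rates, by = "component")
billed$cost <- billed$quantity * billed$rate

# Expense per component and in total.
cost_by_component <- aggregate(cost ~ component, data = billed, FUN = sum)
total_cost <- sum(billed$cost)

print(cost_by_component)
print(total_cost)
```

Here the compute records contribute 24 * 0.5 + 12 * 0.5 = 18 and the storage record contributes 10 * 0.25 = 2.5, for a total of 20.5 in whatever currency the rates are quoted in.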

The basic information for expense calculation includes the currency, locale, region, and offer ID.

# not-run

# CURR   <- "your_currency"
# LOCALE <- "locale_of_the_azure_subscription"
# REG    <- "region_of_the_azure_subscription"
# OFFER  <- "a_valid_offer_id"
CURR   <- "USD"
LOCALE <- "en-SG"
REG    <- "SG"
OFFER  <- "MS-AZR-0015P"

Again, we assume that this information is pre-stored in the credentials script.

consum <- costDSVM(context,
                   hostname=VM,
                   time.start=START,
                   time.end=END,
                   granularity=GRA,
                   currency=CURR,
                   locale=LOCALE,
                   offerId=OFFER,
                   region=REG)

print(consum)


Azure/AzureDSVM documentation built on May 20, 2019, 2:03 p.m.