Load the VISDOM packages and instantiate your customer data source.

library(visdom)
library(visdomloadshape)
# First, implement your own DataSource by implementing the DataSource interface methods
# Then instantiate your data source (TestData is used here as an example)
DATA_SOURCE = TestData()
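If you have not implemented a DataSource yet, the sketch below is a purely hypothetical stand-in built from a plain list of functions. It provides only the two methods this walkthrough calls, getIds and getHourlyAlignedData (assuming n limits the number of customers returned); it is not the package's actual DataSource interface, and myAlignedData is a placeholder for your own data.

# hypothetical stand-in exposing just the two methods used in this walkthrough
myDataSource = list(
  getIds = function() unique(myAlignedData[, 1]),
  getHourlyAlignedData = function(n = NULL) {
    ids = unique(myAlignedData[, 1])
    if (!is.null(n)) ids = head(ids, n)   # assume n caps the number of customers
    myAlignedData[myAlignedData[, 1] %in% ids, ]
  }
)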
1. Load 1-hour aligned data

Hourly aligned data has the following format: 365 rows per customer, each covering the same one-year period, with metadata columns (meter ID, customer ID, date) followed by 24 observation columns by hour of day, starting with the hour ending at 1 AM. The code below treats columns 1-4 as metadata and columns 5-28 as the hourly observations.

hourlyAlignedData = DATA_SOURCE$getHourlyAlignedData(n=10)
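As a quick sanity check on the format described above (the column indices are assumptions based on that description and the indexing used below):

# expect 28 columns: 4 metadata columns followed by 24 hourly values
stopifnot(ncol(hourlyAlignedData) == 28)
# each customer should contribute 365 rows (one per day of the year)
nCust = length(unique(hourlyAlignedData[, 1]))
stopifnot(nrow(hourlyAlignedData) == 365 * nCust)
head(hourlyAlignedData[, 1:6])  # metadata plus the first two hourly columns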
2. Create a load shape dictionary (via adaptive K-means clustering). WARNING: this will run for a long time if you have several thousand customers, so you may want to sub-sample your data to reduce run times (see the sketch after the call below)!
dict100 = create_dictionary(hourlyAlignedData[,5:28], target.size=100, mode=1, d.metric=1, ths=0.2, iter.max=100, nstart=1, two.step.compress=F, verbose=F)
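If run times are a concern, one way to sub-sample is to cluster a random subset of day rows before building the dictionary; the sample size here is an arbitrary illustration. Note that raw2encoded below also exposes its own sampling controls (use.all, s.size).

# draw a random subset of day rows to speed up clustering (size is an example)
set.seed(1)
keep = sample(nrow(hourlyAlignedData), min(50000, nrow(hourlyAlignedData)))
dictSample = create_dictionary(hourlyAlignedData[keep, 5:28], target.size=100, mode=1,
                               d.metric=1, ths=0.2, iter.max=100, nstart=1,
                               two.step.compress=F, verbose=F)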
3. More useful: create the dictionary and shape encodings together. The result is returned as a list of the form: list(clean.data = cdata, dictionary = dic, encoded.data = encoded)
encodedOut    = raw2encoded(hourlyAlignedData[,5:28], use.all = T, s.size = 100000, target.size=100, mode=1, d.metric=1, ths=0.2, iter.max=100, nstart=1, two.step.compress=F,verbose=F)
dict100       = encodedOut$dictionary
encodings100  = encodedOut$encoded.data # first column: encoding of all load shapes, 2nd: total daily energy, 3rd: daily sum of squared errors, 4th: relative square error
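Since the 4th column holds the relative square error, a simple check of encoding quality is to look at its distribution. This is an illustrative sketch using base R, not part of the package API:

# distribution of per-day encoding error (column 4 = relative square error)
summary(encodings100[, 4])
hist(encodings100[, 4], breaks = 50,
     main = "Relative square error of shape encodings",
     xlab = "relative square error")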
4. Plot the top 10 load shapes
draw.top.n.shapes(encodings100[,1], en.s=encodings100[,1], dic=dict100, n=10, show.avg=0)
5. Calculate the entropy of load shapes (a metric of how variable each customer's load shapes are)
cust1Id = unique(hourlyAlignedData[,1])[1] # first customer ID
cust1Rows = hourlyAlignedData[,1] == cust1Id # identify the rows of the aligned data (and therefore of the encodings) that belong to the first customer
cust1Entropy = shannon.entropy(encodings100[cust1Rows,1])

Other options and Rmd TODOs:

# compute shape features from the per-day category encodings
# (renamed from shapeFeatures to avoid shadowing the shapeFeatures() function)
shapeFeatureData = shapeFeatures(shapeCategoryEncoding(rawData=hourlyAlignedData, metaCols=1:4, encoding.dict=dict100 ))

ctx = new.env()                          # shared context for the feature iterator
ctx$fnVector = c(visdom::basicFeatures)  # feature functions to run for each meter
featureRunRaw = iterator.iterateMeters( DATA_SOURCE$getIds(), visdom::iterator.callAllFromCtx, ctx = ctx )
features = visdom::iterator.todf( featureRunRaw )  # flatten iterator results into a data.frame

exportFeatureAndShapeResults( feature.data=features, shape.results.data=shapeFeatureData, format='csv', filePath='out' )
# loop through all customers and build an array of entropy values to put into a histogram.
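A minimal sketch of that loop, reusing hourlyAlignedData and encodings100 from above (sapply stands in for an explicit loop):

custIds = unique(hourlyAlignedData[, 1])
custEntropy = sapply(custIds, function(id) {
  rows = hourlyAlignedData[, 1] == id       # days belonging to this customer
  shannon.entropy(encodings100[rows, 1])    # entropy of the customer's shape codes
})
hist(custEntropy, breaks = 30,
     main = "Load shape entropy across customers", xlab = "entropy")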

# use the same data to produce the average energy / entropy scatter
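Building on the histogram sketch above (custIds and custEntropy), column 2 of the encoded data is total daily energy, so per-customer averages give the x-axis:

# average daily energy per customer (column 2 is total daily energy)
custAvgEnergy = sapply(custIds, function(id) {
  mean(encodings100[hourlyAlignedData[, 1] == id, 2])
})
plot(custAvgEnergy, custEntropy,
     xlab = "average daily energy", ylab = "load shape entropy",
     main = "Energy vs. shape entropy by customer")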

# define some segmentation criteria (one possible approach is sketched below)
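As one hypothetical criterion, customers could be split at the medians of the two metrics above; this is only an illustration, not a recommendation from the package:

# illustrative 2x2 segmentation: high/low energy crossed with high/low entropy
energySeg  = ifelse(custAvgEnergy > median(custAvgEnergy), "highEnergy",  "lowEnergy")
entropySeg = ifelse(custEntropy   > median(custEntropy),   "highEntropy", "lowEntropy")
table(energySeg, entropySeg)   # customer counts per segment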

