msm: Performing Monte Carlo Simulations of Markov Chain


msm    R Documentation

Performing Monte Carlo Simulations of Markov Chain

Description

This is the main function for running Monte Carlo simulations of a Markov chain to dynamically forecast the HVT states of a time series dataset. It supports both ex-post and ex-ante analysis, and it can resolve problematic state transitions through clustering and nearest-neighbor methods to improve simulation accuracy.
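The underlying idea, Monte Carlo simulation of a Markov chain, can be sketched in base R. This is a minimal illustration and not the package's implementation: the 3-state matrix `P` and the helper `simulate_chain()` are hypothetical names introduced here, and `msm` itself takes a dataframe of transition probabilities rather than a matrix.

```r
# Hypothetical 3-state transition matrix; each row sums to 1.
P <- matrix(c(0.7, 0.2, 0.1,
              0.3, 0.4, 0.3,
              0.2, 0.3, 0.5),
            nrow = 3, byrow = TRUE,
            dimnames = list(1:3, 1:3))

# Simulate one path of n_ahead steps starting from initial_state.
simulate_chain <- function(P, initial_state, n_ahead) {
  path <- integer(n_ahead)
  current <- initial_state
  for (step in seq_len(n_ahead)) {
    # Draw the next state from the current state's row of probabilities.
    current <- sample.int(nrow(P), size = 1, prob = P[current, ])
    path[step] <- current
  }
  path
}

set.seed(42)
# Rows are time steps, columns are independent simulated paths.
simulations <- replicate(100, simulate_chain(P, initial_state = 2, n_ahead = 10))
dim(simulations)
```

Each column is one possible future trajectory of states; repeating the draw many times gives a distribution of states at every time step, which can then be summarised into a single forecast.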

Usage

msm(
  state_time_data,
  forecast_type = "ex-post",
  initial_state,
  n_ahead_ante,
  transition_probability_matrix,
  num_simulations = 100,
  trainHVT_results,
  scoreHVT_results,
  actual_data = NULL,
  raw_dataset,
  k = 5,
  handle_problematic_states = FALSE,
  n_nearest_neighbor = 1,
  show_simulation = TRUE,
  mae_metric = "median",
  time_column = NULL,
  plot_type = "static"
)

Arguments

state_time_data

DataFrame. A dataframe containing state transitions over time (cell ID and timestamp).

forecast_type

Character. A character to indicate the type of forecasting. Accepted values are "ex-post" or "ex-ante".

initial_state

Numeric. An integer indicating the state at t0.

n_ahead_ante

Numeric. A vector of n-ahead points to be predicted in ex-ante analysis.

transition_probability_matrix

DataFrame. A dataframe of transition probabilities, i.e. the output of the 'getTransitionProbability' function.

num_simulations

Integer. A number indicating the total number of simulations to run. Default is 100.

trainHVT_results

List. Output of the 'trainHVT' function.

scoreHVT_results

List. Output of the 'scoreHVT' function.

actual_data

DataFrame. A dataframe containing the actual raw data values for the ex-post prediction period. Default is NULL.

raw_dataset

DataFrame. The raw input dataset, from which the mean and standard deviation are calculated to scale the predicted values back up.

k

Integer. The number of optimal clusters to use when handling problematic states. Default is 5.

handle_problematic_states

Logical. Whether to handle problematic states. Default is FALSE.

n_nearest_neighbor

Integer. The number of nearest neighbors to consider when handling problematic states. Default is 1.

show_simulation

Logical. Whether to show the simulation lines in plots. Default is TRUE.

mae_metric

Character. The metric used to calculate the Mean Absolute Error (MAE). Accepted values are "mean", "median", or "mode". Default is "median".

time_column

Character. The name of the column containing time data. Used for aligning and plotting the results. Default is NULL.

plot_type

Character. The type of plot to generate. Accepted values are "static" (ggplot object) or "interactive" (plotly object). Default is "static".
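The `raw_dataset` and `mae_metric` arguments describe two pieces of arithmetic: scaling predicted values back up with the mean and standard deviation of the raw data, and measuring error against a "mean", "median", or "mode" summary. A rough base R sketch of that arithmetic follows, using made-up numbers. All variable names and the helpers `stat_mode()` and `summarise_sims()` are illustrative assumptions, and summarising across simulations at each time step is one plausible reading; the package's internal calculation may differ.

```r
# Made-up raw series and its scaling parameters.
raw_values <- c(10, 12, 11, 15, 14, 13)
raw_mean <- mean(raw_values)
raw_sd <- sd(raw_values)

# Made-up normalized predictions: rows are time steps, columns are simulations.
scaled_sims <- matrix(c(-0.5, 0.0, 0.5,
                         0.0, 0.5, 0.5,
                         1.0, 0.5, 1.0),
                      nrow = 3, byrow = TRUE)

# Scale predictions back to raw units: x = z * sd + mean.
sims <- scaled_sims * raw_sd + raw_mean

# Statistical mode (base R's mode() returns the storage type instead).
stat_mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Collapse the simulations at each time step into a single forecast.
summarise_sims <- function(sims, metric = c("median", "mean", "mode")) {
  metric <- match.arg(metric)
  f <- switch(metric, mean = mean, median = median, mode = stat_mode)
  apply(sims, 1, f)
}

actual <- c(12, 13, 15)
forecast <- summarise_sims(sims, "median")

# Mean Absolute Error between the actual values and the summarised forecast.
mae <- mean(abs(actual - forecast))
```

The median is less sensitive than the mean to a few extreme simulated paths, which may be why it is the default for `mae_metric`.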

Value

A list object that contains the forecasting plots and MAE values.

[[1]]

Simulation plots and MAE values for the state and centroid plots.

[[2]]

Summary table, dendrogram plot, and clustered heatmap, returned when handle_problematic_states is TRUE.
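The combination of clustering and nearest-neighbor lookup used for problematic states can be illustrated with base R. This is a sketch under stated assumptions, not the package's algorithm: the centroids are made up, the set of problematic states is simply declared, and `nearest_neighbors()` is a hypothetical helper. It uses 2 clusters for the toy data, whereas the default `k` is 5.

```r
# Hypothetical 2-D centroids for six states (cells).
centroids <- data.frame(
  x = c(0.0, 0.2, 0.1, 5.0, 5.2, 5.1),
  y = c(0.0, 0.1, 0.3, 5.0, 5.1, 4.8),
  row.names = paste0("state_", 1:6)
)

# Assumption: states 3 and 6 have been flagged as problematic.
problematic <- c("state_3", "state_6")

# Hierarchical clustering of the centroids (the basis of a dendrogram),
# cut into k = 2 groups for this toy example.
centroid_dist <- dist(centroids)
dist_matrix <- as.matrix(centroid_dist)
tree <- hclust(centroid_dist, method = "ward.D2")
clusters <- cutree(tree, k = 2)

# For a problematic state, return the closest non-problematic states
# that belong to the same cluster.
nearest_neighbors <- function(state, n_nearest_neighbor = 1) {
  same_cluster <- names(clusters)[clusters == clusters[[state]]]
  candidates <- setdiff(same_cluster, problematic)
  distances <- sort(setNames(dist_matrix[state, candidates], candidates))
  names(distances)[seq_len(min(n_nearest_neighbor, length(distances)))]
}

replacements <- lapply(setNames(problematic, problematic), nearest_neighbors)
replacements
```

Restricting the search to the same cluster keeps a substituted state close to the original in feature space, so a simulated path that hits a problematic state can continue from a comparable one.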

Author(s)

Vishwavani <vishwavani@mu-sigma.com>

Examples

library(HVT)

# Build a dataset with a time column and four stock indices.
dataset <- data.frame(t = as.numeric(time(EuStockMarkets)),
                      DAX = EuStockMarkets[, "DAX"],
                      SMI = EuStockMarkets[, "SMI"],
                      CAC = EuStockMarkets[, "CAC"],
                      FTSE = EuStockMarkets[, "FTSE"])

# Train the HVT model on the index columns (time column excluded).
hvt.results <- trainHVT(dataset[, -1], n_cells = 60, depth = 1, quant.err = 0.1,
                        distance_metric = "L1_Norm", error_metric = "max",
                        normalize = TRUE, quant_method = "kmeans")

# Score the dataset so that each observation is assigned to a cell (state).
scoring <- scoreHVT(dataset, hvt.results)
cell_id <- scoring$scoredPredictedData$Cell.ID
time_stamp <- dataset$t
temporal_data <- data.frame(cell_id, time_stamp)

# Estimate transition probabilities between states.
# (Named transition_probs rather than table to avoid masking base::table.)
transition_probs <- getTransitionProbability(temporal_data,
                                             cellid_column = "cell_id",
                                             time_column = "time_stamp")
colnames(temporal_data) <- c("Cell.ID", "t")

# Hold out rows 1800 to 1860 as the ex-post forecasting period.
ex_post_forecasting <- dataset[1800:1860, ]

# Run the ex-post Monte Carlo simulation.
ex_post <- msm(state_time_data = temporal_data,
               forecast_type = "ex-post",
               transition_probability_matrix = transition_probs,
               initial_state = 2,
               num_simulations = 100,
               scoreHVT_results = scoring,
               trainHVT_results = hvt.results,
               actual_data = ex_post_forecasting,
               raw_dataset = dataset,
               mae_metric = "median",
               show_simulation = FALSE,
               time_column = "t")

HVT documentation built on April 3, 2025, 8:45 p.m.