msm | R Documentation |
This is the main function for running Monte Carlo simulations of a Markov chain to dynamically forecast the HVT states of a time series dataset. It supports both ex-post and ex-ante analysis, and it can resolve problematic state transitions through clustering and nearest-neighbor methods to improve simulation accuracy.
msm(
state_time_data,
forecast_type = "ex-post",
initial_state,
n_ahead_ante,
transition_probability_matrix,
num_simulations = 100,
trainHVT_results,
scoreHVT_results,
actual_data = NULL,
raw_dataset,
k = 5,
handle_problematic_states = FALSE,
n_nearest_neighbor = 1,
show_simulation = TRUE,
mae_metric = "median",
time_column = NULL,
plot_type = "static"
)
state_time_data |
DataFrame. A dataframe containing state transitions over time (cell ID and timestamp). |
forecast_type |
Character. A character to indicate the type of forecasting. Accepted values are "ex-post" or "ex-ante". |
initial_state |
Numeric. An integer indicating the state at t0. |
n_ahead_ante |
Numeric. A vector of n-ahead points to be predicted in ex-ante analysis. |
transition_probability_matrix |
DataFrame. A dataframe of transition probabilities, i.e. the output of the 'getTransitionProbability' function. |
num_simulations |
Integer. A number indicating the total number of simulations to run. Default is 100. |
trainHVT_results |
List. The output of the 'trainHVT' function. |
scoreHVT_results |
List. The output of the 'scoreHVT' function. |
actual_data |
DataFrame. A dataframe with the actual raw data values for the ex-post prediction period. |
raw_dataset |
DataFrame. The raw input dataset, from which the mean and standard deviation are calculated to scale up the predicted values. |
k |
Integer. The optimal number of clusters to use when handling problematic states. Default is 5. |
handle_problematic_states |
Logical. To indicate whether to handle problematic states or not. Default is FALSE. |
n_nearest_neighbor |
Integer. A number of nearest neighbors to consider when handling problematic states. Default is 1. |
show_simulation |
Logical. To indicate whether to show the simulation lines in plots or not. Default is TRUE. |
mae_metric |
Character. A character to indicate which metric to calculate Mean Absolute Error. Accepted entries are "mean", "median", or "mode". Default is "median". |
time_column |
Character. The name of the column containing time data. Used for aligning and plotting the results. |
plot_type |
Character. A character to indicate what type of plot should be generated. Accepted entries are "static" (ggplot object) or "interactive" (plotly object). Default is "static". |
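The raw_dataset argument above is described as the source of the mean and standard deviation used to scale predicted values back up. A minimal sketch of that idea in base R follows, assuming plain z-score normalization; the data values and helper names (col_means, col_sds, scaled, rescaled) are hypothetical, and msm's internal procedure may differ.

```r
# Hypothetical raw data with two variables (illustration only).
raw_dataset <- data.frame(DAX = c(1600, 1650, 1700, 1750),
                          SMI = c(1700, 1680, 1720, 1760))

# Per-column mean and standard deviation of the raw data.
col_means <- colMeans(raw_dataset)
col_sds <- apply(raw_dataset, 2, sd)

# Z-score normalization, standing in for values in the model's scaled space.
scaled <- scale(raw_dataset, center = col_means, scale = col_sds)

# Invert the normalization column by column: multiply by sd, then add the mean.
rescaled <- sweep(sweep(scaled, 2, col_sds, `*`), 2, col_means, `+`)
```

Under this assumption, rescaled recovers the original values of raw_dataset up to floating-point error.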
A list object that contains the forecasting plots and MAE values.
[[1]] |
Simulation plots and MAE values for state and centroids plot |
[[2]] |
Summary table, dendrogram plot, and clustered heatmap, returned when handle_problematic_states is TRUE |
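To make the reported MAE values more concrete, here is a minimal sketch of how simulated paths could be collapsed with the "mean", "median", or "mode" option of mae_metric and then compared with actual values. This is one plausible reading, not msm's confirmed computation; the data and the helpers stat_mode, aggregate_sims, and mae are hypothetical.

```r
# Hypothetical simulated states: 5 forecast steps (rows) x 4 simulations (columns).
sims <- matrix(c(2, 2, 3, 2,
                 3, 3, 3, 1,
                 1, 2, 2, 2,
                 3, 3, 2, 3,
                 1, 1, 1, 2),
               nrow = 5, byrow = TRUE)
# Hypothetical actual states for the same 5 steps.
actual <- c(2, 3, 1, 3, 1)

# Most frequent value; ties resolve to the smallest value.
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# Collapse the simulations at each step into a single predicted value.
aggregate_sims <- function(sims, metric = c("mean", "median", "mode")) {
  metric <- match.arg(metric)
  fun <- switch(metric, mean = mean, median = median, mode = stat_mode)
  apply(sims, 1, fun)
}

# Mean absolute error between actual and predicted values.
mae <- function(actual, predicted) mean(abs(actual - predicted))

mae_by_metric <- sapply(c("mean", "median", "mode"),
                        function(m) mae(actual, aggregate_sims(sims, m)))
```

On this toy data, mae_by_metric is 0.4 for "mean" and 0.2 for both "median" and "mode", which shows how the choice of aggregation can change the reported error.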
Vishwavani <vishwavani@mu-sigma.com>
# Build a dataset with a numeric time column and four stock indices
dataset <- data.frame(t = as.numeric(time(EuStockMarkets)),
                      DAX = EuStockMarkets[, "DAX"],
                      SMI = EuStockMarkets[, "SMI"],
                      CAC = EuStockMarkets[, "CAC"],
                      FTSE = EuStockMarkets[, "FTSE"])

# Train the HVT model on the index columns (time column excluded)
hvt.results <- trainHVT(dataset[, -1], n_cells = 60, depth = 1, quant.err = 0.1,
                        distance_metric = "L1_Norm", error_metric = "max",
                        normalize = TRUE, quant_method = "kmeans")

# Score the dataset to assign each observation to a cell
scoring <- scoreHVT(dataset, hvt.results)
cell_id <- scoring$scoredPredictedData$Cell.ID
time_stamp <- dataset$t
temporal_data <- data.frame(cell_id, time_stamp)

# Estimate the transition probabilities between cells
table <- getTransitionProbability(temporal_data,
                                  cellid_column = "cell_id",
                                  time_column = "time_stamp")

# Rename the columns before passing the state data to msm
colnames(temporal_data) <- c("Cell.ID", "t")

# Actual values for the ex-post prediction period
ex_post_forecasting <- dataset[1800:1860, ]

ex_post <- msm(state_time_data = temporal_data,
               forecast_type = "ex-post",
               transition_probability_matrix = table,
               initial_state = 2,
               num_simulations = 100,
               scoreHVT_results = scoring,
               trainHVT_results = hvt.results,
               actual_data = ex_post_forecasting,
               raw_dataset = dataset,
               mae_metric = "median",
               show_simulation = FALSE,
               time_column = 't')
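
For intuition about what a Monte Carlo simulation of a Markov chain involves, the following base R sketch draws repeated state paths from a transition matrix. It is an illustration under stated assumptions, not the implementation used by msm: the 3-state matrix P and the helper simulate_path are hypothetical, and a plain matrix stands in for the dataframe returned by getTransitionProbability.

```r
# Hypothetical 3-state transition matrix; each row sums to 1.
P <- matrix(c(0.7, 0.2, 0.1,
              0.3, 0.5, 0.2,
              0.2, 0.3, 0.5),
            nrow = 3, byrow = TRUE,
            dimnames = list(1:3, 1:3))

# Simulate one path of n_ahead steps starting from initial_state.
simulate_path <- function(P, initial_state, n_ahead) {
  states <- integer(n_ahead)
  current <- initial_state
  for (i in seq_len(n_ahead)) {
    # Draw the next state using the row of P for the current state.
    current <- sample.int(nrow(P), size = 1, prob = P[current, ])
    states[i] <- current
  }
  states
}

set.seed(42)
num_simulations <- 100
n_ahead <- 10

# One row per forecast step, one column per simulation.
sims <- replicate(num_simulations,
                  simulate_path(P, initial_state = 2, n_ahead = n_ahead))
dim(sims)
```

Each column of sims is one simulated trajectory. Summarizing across columns at each step (for example with the median) gives a single forecast path per time point.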