rpart_fit: Determine model fit of RPART tree

View source: R/INTRA_FORECAST_rpart_fit.R

Description

rpart_fit is a function that gauges the fit of a single RPART tree model run for a given set of parameters. It is used to fine-tune the RPART tree when forecasting.

Usage

rpart_fit(ML_data, minsplit, maxdepth, cp, xval)

Arguments

ML_data

Dataset that has been prepared to run through RPART. If it was originally a time series object, it must already have gone through the decompose_ts_object_for_ML function, and the first difference of the column of interest must have been taken.

minsplit

RPART parameter. The minimum number of observations that must exist in a node in order for a split to be attempted (default from RPART = 20)

maxdepth

RPART parameter. The maximum depth of any node in the tree (default from RPART = 30)

cp

RPART parameter. The complexity parameter: the minimum increase in R-squared that a split must achieve for it to be attempted (default from RPART = 0.01)

xval

RPART parameter. Number of cross-validations to run. This is important because cross-validation helps to limit over-fitting (default from RPART = 10)
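
The four tuning arguments above correspond to arguments of the same name in rpart::rpart.control. The sketch below shows how such values are typically passed to an RPART regression tree. It is not the internal code of rpart_fit: the toy_data data frame and its columns are hypothetical and exist only for illustration.

```r
library(rpart)

# Hypothetical toy data, used only to illustrate the control parameters
set.seed(42)
toy_data <- data.frame(
  x1 = runif(200),
  x2 = runif(200)
)
toy_data$y <- 3 * toy_data$x1 - 2 * toy_data$x2 + rnorm(200, sd = 0.1)

# Collect the tuning parameters (the values shown are the RPART defaults)
control <- rpart.control(
  minsplit = 20,  # minimum observations in a node before a split is attempted
  maxdepth = 30,  # maximum depth of any node in the final tree
  cp = 0.01,      # complexity parameter
  xval = 10       # number of cross-validations
)

# Fit a regression ("anova") tree with these settings
fit <- rpart(y ~ x1 + x2, data = toy_data, method = "anova", control = control)

# Show the complexity table, including the cross-validated error
printcp(fit)
```

Raising minsplit or cp, or lowering maxdepth, generally yields a smaller tree.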

Value

The mean absolute prediction error (MAPE), in percentage terms, of the model run
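
As a rough illustration of how an error measure of this kind can be computed, the sketch below assumes that MAPE is the mean of the absolute errors expressed as a percentage of the actual values. The exact calculation inside rpart_fit is not documented on this page, so the mape helper and the example vectors are hypothetical.

```r
# Hypothetical helper: mean absolute error as a percentage of the actuals
mape <- function(actual, predicted) {
  mean(abs((actual - predicted) / actual)) * 100
}

# Made-up values, for illustration only
actual <- c(100, 200, 400)
predicted <- c(110, 190, 400)

# Absolute percentage errors are 10%, 5% and 0%, so the mean is about 5
result <- mape(actual, predicted)
print(result)
```

Note that a measure defined this way is undefined when any actual value is zero.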

Examples

library(magrittr)  # provides the %>% pipe used below

# Prepare the data for one group and take the first difference
ML_data <- tstools::initialize_ts_forecast_data(
      data = dummy_gasprice,
      date_col = "year_month",
      col_of_interest = "gasprice",
      group_cols = c("state", "oil_company"),
      xreg_cols = c("spotprice", "gemprice")
   ) %>%
   dplyr::filter(grouping == "state = New York   &   oil_company = CompanyA") %>%
   tstools::transform_data_to_ts_object() %>%
   decompose_ts_object_for_ML() %>%
   dplyr::mutate(col_of_interest = col_of_interest - dplyr::lag(col_of_interest)) %>%
   dplyr::filter(!is.na(col_of_interest))

# Gauge the fit of an RPART tree using the RPART default parameters
rpart_fit(
   ML_data = ML_data,
   minsplit = 20,
   maxdepth = 30,
   cp = 0.01,
   xval = 10
)

ing-bank/tsforecast documentation built on Sept. 18, 2020, 9:40 a.m.