detrend: Removes trend from data

View source: R/data_preprocessing.R

detrend    R Documentation

Removes trend from data

Description

Takes a list of train and application data as prepared by split_data_counterfactual() and removes a polynomial (linear or quadratic), exponential or cubic spline trend. The trend is estimated from the train data only. Use it as a preprocessing step before training a model based on decision trees, i.e. random forest and lightgbm. Other methods may also benefit, but they are generally able to deal with trends themselves. We therefore recommend trying out different versions and guiding decisions with the model evaluation metrics from calc_performance_metrics().
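To make the idea concrete, the following is a minimal base R sketch of linear detrending with the trend fitted on the train rows only. It is not the package's implementation: the synthetic data and the column names `t` and `value` are assumptions made for illustration.

```r
# Hypothetical data: a numeric time index `t` and a target column `value`
# with an upward linear trend plus noise.
set.seed(1)
df <- data.frame(t = 1:100)
df$value <- 0.5 * df$t + rnorm(100)

train <- df[1:80, ]
apply_df <- df[81:100, ]

# Estimate the trend from the train rows only.
trend_fit <- lm(value ~ t, data = train)

# Subtract the fitted (train) or extrapolated (apply) trend from each part.
train$value <- train$value - predict(trend_fit, newdata = train)
apply_df$value <- apply_df$value - predict(trend_fit, newdata = apply_df)
```

After this step the train values are centered around zero, so a tree-based model does not need to predict values outside the range it saw during training.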

Usage

detrend(split_data, mode = "linear", num_splines = 5, log_transform = FALSE)

Arguments

split_data

List of two named dataframes called train and apply

mode

String defining the type of trend present in the data. Options are "linear", "quadratic", "exponential", "spline" and "none". "none" returns the original data

num_splines

Defines the number of cubic splines if mode = "spline". Choose num_splines = 1 for a cubic polynomial trend. If mode != "spline", this parameter is ignored

log_transform

If TRUE, apply a log transformation before detrending to ensure positivity of all predictions in the rest of the pipeline. An exp transformation is then necessary during retrending to return to the solution space. Use only in combination with the log_transform parameter in retrend_predictions()
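The log/exp round trip described for log_transform can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used by detrend() or retrend_predictions(): the data are synthetic and `pred_detrended` is a hypothetical stand-in for model predictions.

```r
# Hypothetical strictly positive series with multiplicative growth.
set.seed(2)
t <- 1:50
y <- exp(0.02 * t + rnorm(50, sd = 0.1))

# Detrend in log space: log first, then remove a linear trend.
log_y <- log(y)
trend_fit <- lm(log_y ~ t)
trend <- as.numeric(fitted(trend_fit))
detrended <- log_y - trend

# Stand-in for predictions made in detrended log space
# (here simply the detrended data itself).
pred_detrended <- detrended

# Retrend: add the trend back, then apply exp to return to original units.
pred_original <- exp(pred_detrended + trend)
```

Because exp() maps every real number to a positive value, predictions returned this way cannot be negative, whatever the model predicts in log space.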

Details

Apply retrend_predictions() to predictions to return to the original data units.

Value

List of 3 elements: the detrended dataframes train and apply, and the fitted trend function model

Examples

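# Load the mock dataset and split it into train and apply parts
data(mock_env_data)
split_data <- list(
  train = mock_env_data[1:80, ],
  apply = mock_env_data[81:100, ]
)
# Estimate a linear trend on train and remove it from both parts
detrended_list <- detrend(split_data, mode = "linear")
# Extract the detrended data and the fitted trend
detrended_train <- detrended_list$train
detrended_apply <- detrended_list$apply
trend <- detrended_list$model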

ubair documentation built on April 12, 2025, 2:12 a.m.