GLM: Generalised Linear Model

GLM	R Documentation

Generalised Linear Model

Description

This function is a custom model wrapper for the caret R package, used to train Generalised Linear Models and predict with them.

Usage

GLM(input_parameters = NULL)

Arguments

`formula`	-arg for train()- <formula> providing the model output feature and the model input features. Inputs can be columns defined in data argument and/or features described in transformationSentences argument.
`data`	-arg for train()- <data.frame> containing the output feature and all the raw input features used to train the model.
`trainMask`	-arg for train()- <array> providing the mask (TRUE to include a row, FALSE to exclude it) for the dataset used for model training. This mask is applied after all transformation procedures. The array must be the same length as the number of rows in the data argument.
`numericStatusVariable`	-arg for train()- <string> defining the name of the column in data that contains the numerical status information (normally 0 and 1) of the output variable. If it is not NULL (default), all model coefficients are fitted considering as model input the transformed input values multiplied by this numeric status variable.
`characterStatusVariable`	-arg for train()- <string> defining the name of the column in data that contains the discrete status information of the output variable. If it is not NULL (default), all model coefficients are fitted separately for each of the possible statuses defined in the character status variable.
`transformationSentences`	-arg for train()- <list>. Run ?data_transformation_wrapper() for details.
`familyGLM`	-arg for train()- <function> indicating the family (error distribution and link function) to be used in the model. For GLM this can be a character string naming a family function, a family function, or the result of a call to a family function. Run ?stats::family for details.
`continuousTime`	-arg for train()- <boolean> indicating if the fitting process of the model coefficients should account for the data gaps. Set to
`maxPredictionValue`	-arg for train()- <float> defining the maximum value of predictions.
`minPredictionValue`	-arg for train()- <float> defining the minimum value of predictions.
`weatherDependenceByCluster`	-arg for train()- <data.frame> containing the columns 's', 'heating', 'cooling', 'tbalh', 'tbalc'; corresponding to the daily load curve cluster, the heating dependence (TRUE or FALSE), the cooling dependence (TRUE or FALSE), the balance heating temperature, and the balance cooling temperature, respectively.
`clusteringResults`	-arg for train()- <list> from the output produced by clustering_dlc().
`newdata`	-arg for biggr::predict.train()- <data.frame> containing the input data to consider in a model prediction.
`forceGlobalInputFeatures`	-arg for biggr::predict.train()- <list> containing the input model features to overwrite in newdata. Each input feature must have either length 1 or a length equal to the number of rows in newdata.
`forceInitInputFeatures`	-arg for biggr::predict.train()- <list> containing the last timesteps of the input features.
`forceInitOutputFeatures`	-arg for biggr::predict.train()- <list> containing the last timesteps of the output feature.
`forceOneStepPrediction`	-arg for biggr::predict.train()- <boolean> indicating whether the prediction should be made in one-step prediction mode.
`predictionIntervals`	-arg for biggr::predict.train()- <boolean> describing if the prediction should be of the average value or the prediction interval.
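
The three accepted forms of familyGLM (a character string, a family function, or the result of calling a family function) follow the conventions of stats::family. The sketch below illustrates them with stats::glm() on the built-in warpbreaks dataset rather than with GLM() itself; it assumes, without confirmation from this page, that the family is ultimately handed to a stats-style GLM fit.

```r
# Three equivalent ways to specify a family, shown with stats::glm().
# 'warpbreaks' ships with base R (counts of warp breaks per loom).
fit_string   <- glm(breaks ~ wool + tension, data = warpbreaks,
                    family = "quasipoisson")
fit_function <- glm(breaks ~ wool + tension, data = warpbreaks,
                    family = quasipoisson)
fit_call     <- glm(breaks ~ wool + tension, data = warpbreaks,
                    family = quasipoisson(link = "log"))

# All three resolve to the same family object, so the fits coincide.
same_1 <- all.equal(coef(fit_string), coef(fit_function))
same_2 <- all.equal(coef(fit_string), coef(fit_call))

# The link function is stored inside the family object.
link_used <- fit_call$family$link
```

Calling the family function explicitly is the only form that allows a non-default link to be chosen, which is why the training example uses quasipoisson() rather than a bare name.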

Value

When training: <list> containing the model. When predicting: <array> of the predicted results.
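
The predicted <array> is limited by minPredictionValue and maxPredictionValue when those arguments are set. This page does not state how the limits are enforced; the sketch below assumes simple clipping and uses a hypothetical helper, clamp_predictions(), that is not part of biggr.

```r
# Hypothetical helper (NOT part of biggr): clip predictions to the
# interval [min_value, max_value]. A NULL bound means "no limit".
clamp_predictions <- function(pred, min_value = NULL, max_value = NULL) {
  if (!is.null(min_value)) pred <- pmax(pred, min_value)
  if (!is.null(max_value)) pred <- pmin(pred, max_value)
  pred
}

# Made-up raw predictions, two of them outside the allowed range.
raw <- c(-2.5, 0.3, 4.8, 12.1)
clamped <- clamp_predictions(raw, min_value = 0, max_value = 10)
```

Under this assumption, values already inside the range are left unchanged and only the out-of-range ones are moved to the nearest bound.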

Examples

# GLM() should be launched through the train() function for training, and
# through the biggr::predict.train() function for predicting.
# An example for model training is (the result is stored in mod so that
# it can be reused for prediction):
mod <- train(
  formula = Qe ~ daily_seasonality,
  data = df, # data.frame with three columns:
             #  'time', 'Qe', and 'hour';
             # corresponding to time, electricity consumption,
             # and hour of the day.
             # 200 rows of data are needed considering
             # the training control strategy that was selected
             # in argument trControl.
  method = GLM(
    data.frame(parameter = "nhar",
               class = "discrete")
  ),
  tuneGrid = data.frame("nhar" = 4:6),
  trControl = trainControl(method = "timeslice", initialWindow = 100,
                           horizon = 10, skip = 10, fixedWindow = TRUE),
  minPredictionValue = 0,
  maxPredictionValue = max(df$Qe, na.rm = TRUE) * 1.1,
  familyGLM = quasipoisson(),
  transformationSentences = list(
    "daily_seasonality" = c(
      "fs_components(...,featuresName='hour',nHarmonics=param$nhar,inplace=F)",
      "weekday")
  )
)
# An example for model prediction is shown below. crate() comes from the
# carrier package, and mod is the model returned by the train() call above:
predictor <- crate(function(x, forceGlobalInputFeatures = NULL, predictionIntervals = FALSE) {
  biggr::predict.train(
    object = !!mod,
    newdata = x,
    forceGlobalInputFeatures = forceGlobalInputFeatures,
    predictionIntervals = predictionIntervals
  )
})
# An example call of the predictor function to predict Qe at a certain time is:
predictor(
  data.frame(
    time = as.POSIXct("2020-01-01 14:00:00", tz = "UTC"),
    hour = 15
  )
)
# An additional feature of predictors is that this object
# can be stored directly in the MLflow infrastructure using:
mlflow_log_model(predictor, "example_name")
# This last call can only be executed if an MLflow run has been started, see:
?mlflow::mlflow_start_run()
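
The training example tunes nhar over 4:6 and passes it to fs_components() as nHarmonics. Assuming that this transformation expands 'hour' into sine and cosine terms of a Fourier series (an interpretation not confirmed on this page), the idea can be sketched in base R. The function fourier_terms() and the synthetic data below are hypothetical stand-ins, not biggr code.

```r
# Hypothetical stand-in for a Fourier-series expansion of 'hour'
# (period 24 h). This is NOT biggr's fs_components().
fourier_terms <- function(hour, n_harmonics, period = 24) {
  terms <- lapply(seq_len(n_harmonics), function(k) {
    angle <- 2 * pi * k * hour / period
    cbind(sin(angle), cos(angle))   # one sine/cosine pair per harmonic
  })
  out <- do.call(cbind, terms)
  colnames(out) <- paste0(rep(c("sin_", "cos_"), times = n_harmonics),
                          rep(seq_len(n_harmonics), each = 2))
  out
}

# Synthetic hourly consumption with a daily cycle (made-up data).
set.seed(1)
hour <- rep(0:23, times = 10)
Qe <- rpois(length(hour), lambda = exp(2 + 0.5 * sin(2 * pi * hour / 24)))

# Four harmonics give eight input columns; fit a quasi-Poisson GLM on them.
X <- fourier_terms(hour, n_harmonics = 4)
fit <- glm(Qe ~ X, family = quasipoisson())
n_coefficients <- length(coef(fit))   # intercept + 2 * n_harmonics
```

Each additional harmonic adds two coefficients, so a larger nhar lets the fitted daily profile follow sharper features at the cost of a more flexible model, which is the trade-off a tuning grid such as 4:6 explores.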

biggproject/biggr documentation built on Oct. 2, 2024, 11:13 p.m.