deepregression: Fitting Semi-Structured Deep Distributional Regression

deepregression    R Documentation

Fitting Semi-Structured Deep Distributional Regression

Description

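Fits a semi-structured deep distributional regression model, in which each parameter of the response distribution gets its own additive predictor that can combine structured (linear and smooth) effects with deep neural network predictors.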

Usage

deepregression(
  y,
  list_of_formulas,
  list_of_deep_models = NULL,
  family = "normal",
  data,
  seed = as.integer(1991 - 5 - 4),
  return_prepoc = FALSE,
  subnetwork_builder = NULL,
  model_builder = NULL,
  fitting_function = NULL,
  additional_processors = list(),
  penalty_options = penalty_control(),
  orthog_options = orthog_control(),
  weight_options = weight_control(),
  formula_options = form_control(),
  output_dim = 1L,
  verbose = FALSE,
  engine = "tf",
  ...
)

Arguments

y

response variable

list_of_formulas

a named list of right-hand side formulas, one for each parameter of the distribution specified in family; set to ~ 1 if the parameter should be treated as constant. Use the s()-notation from mgcv to specify non-linear structured effects and d(...) for deep learning predictors (predictors in brackets are separated by commas), where d can be replaced by any of the names in list_of_deep_models, e.g., ~ 1 + s(x) + my_deep_mod(a,b,c), where my_deep_mod is the name of the neural net specified in list_of_deep_models and a,b,c are features modeled via this network.
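As a minimal sketch of the structure described above, the following base R snippet builds such a list for the default "normal" family, using the parameter names loc and scale from the Examples section. The network name my_deep_mod and the features x, a, b, c are hypothetical placeholders. Formulas are not evaluated when they are created, so neither s() nor my_deep_mod() has to exist at this point.

```r
# One right-hand side formula per distribution parameter.
# `my_deep_mod`, `x`, `a`, `b`, `c` are hypothetical names used for illustration.
list_of_formulas <- list(
  loc   = ~ 1 + s(x) + my_deep_mod(a, b, c),  # smooth effect plus deep predictor
  scale = ~ 1                                 # parameter treated as constant
)

# Features referenced by each formula (function names such as s are excluded)
loc_vars   <- all.vars(list_of_formulas$loc)
scale_vars <- all.vars(list_of_formulas$scale)
print(loc_vars)
```

Every feature listed in loc_vars would have to be available in data when the list is passed to deepregression().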

list_of_deep_models

a named list of functions specifying a keras model. See the examples for more details.
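A hedged sketch of what such a list might look like: each element is a function that takes an input tensor and returns the output of a stack of keras layers, and the element name is the one used inside the formula. The name my_deep_mod and the layer sizes are assumptions for illustration; the sketch uses the native pipe (R >= 4.1) and keras::layer_dense from the keras package. Defining the function does not evaluate its body, so keras is only needed once the model is built.

```r
# Hypothetical network: two dense layers ending in a single linear output unit.
my_deep_mod <- function(x) {
  x |>
    keras::layer_dense(units = 16, activation = "relu") |>
    keras::layer_dense(units = 1, activation = "linear")
}

# The list name must match the name used in the formula, e.g. my_deep_mod(a, b, c)
list_of_deep_models <- list(my_deep_mod = my_deep_mod)
```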

family

a character specifying the distribution. For information on possible distributions and their parameters, see make_tfd_dist. Can also be a custom distribution.

data

data.frame or named list with input features

seed

a seed for TensorFlow or Torch (only works with R version >= 2.2.0)
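The default value in the Usage section is written as an arithmetic expression; the short check below only shows which integer it evaluates to. The variable name default_seed is introduced here for illustration.

```r
# The default from the Usage section is plain integer arithmetic
default_seed <- as.integer(1991 - 5 - 4)
print(default_seed)  # 1982
```

Any other integer can be supplied through the seed argument to obtain a different, but still reproducible, initialization.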

return_prepoc

logical; if TRUE only the pre-processed data and layers are returned (default FALSE).

subnetwork_builder

function to build each subnetwork (the network for each distribution parameter). Per default NULL, in which case the subnetwork builder is chosen depending on the engine. Can also be a list of the same length as list_of_formulas.

model_builder

function to build the model based on additive predictors. Per default NULL, in which case the model builder is chosen depending on the engine. In order to work with the methods defined for the class deepregression, the model should behave like a keras model.

fitting_function

function to fit the instantiated model when calling fit. Per default NULL, in which case the fit function is chosen depending on the engine.

additional_processors

a named list with additional processors to convert the formula(s). Can have an attribute "controls" to pass additional controls.

penalty_options

options for smoothing and penalty terms defined by penalty_control

orthog_options

options for the orthogonalization defined by orthog_control

weight_options

options for layer weights defined by weight_control

formula_options

options for formula parsing (mainly used to make calculations more efficient)

output_dim

dimension of the output, per default 1L

verbose

logical; whether to print progress of model initialization to console

engine

character; the engine used to set up the neural network ("tf" or "torch")

...

further arguments passed to the model_builder function

References

Ruegamer, D. et al. (2023): deepregression: a Flexible Neural Network Framework for Semi-Structured Deep Distributional Regression. Journal of Statistical Software, 105(2). doi:10.18637/jss.v105.i02.

Examples

library(deepregression)

n <- 1000
data <- data.frame(matrix(rnorm(4 * n), nrow = n, ncol = 4))
colnames(data) <- c("x1","x2","x3","xa")
formula <- ~ 1 + deep_model(x1,x2) + s(xa) + x1 + 
  node(x3, n_trees = 2, n_layers = 2, tree_depth = 1)

deep_model <- function(x) x %>%
  layer_dense(units = 32, activation = "relu", use_bias = FALSE) %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 8, activation = "relu") %>%
  layer_dense(units = 1, activation = "linear")

y <- rnorm(n) + data$xa^2 + data$x1

mod <- deepregression(
  list_of_formulas = list(loc = formula, scale = ~ 1),
  data = data, y = y,
  list_of_deep_models = list(deep_model = deep_model)
)

if (!is.null(mod)) {

  # train for more than 10 epochs to get a better model
  mod %>% fit(epochs = 10, early_stopping = TRUE)
  mod %>% fitted() %>% head()
  cvres <- mod %>% cv()
  mod %>% get_partial_effect(name = "s(xa)")
  mod %>% coef()
  mod %>% plot()

}

mod <- deepregression(
  list_of_formulas = list(loc = ~ 1 + s(xa) + x1, scale = ~ 1,
                          dummy = ~ -1 + deep_model(x1,x2,x3) %OZ% 1),
  data = data, y = y,
  list_of_deep_models = list(deep_model = deep_model),
  mapping = list(1,2,1:2)
)


deepregression documentation built on Aug. 25, 2025, 1:10 a.m.