deepregression: Fitting Semi-Structured Deep Distributional Regression

deepregression    R Documentation

Fitting Semi-Structured Deep Distributional Regression

Description

Fitting Semi-Structured Deep Distributional Regression

Usage

deepregression(
  y,
  list_of_formulas,
  list_of_deep_models = NULL,
  family = "normal",
  data,
  tf_seed = as.integer(1991 - 5 - 4),
  return_prepoc = FALSE,
  subnetwork_builder = subnetwork_init,
  model_builder = keras_dr,
  fitting_function = utils::getFromNamespace("fit.keras.engine.training.Model",
    "keras"),
  additional_processors = list(),
  penalty_options = penalty_control(),
  orthog_options = orthog_control(),
  weight_options = weight_control(),
  formula_options = form_control(),
  output_dim = 1L,
  verbose = FALSE,
  ...
)

Arguments

y

response variable

list_of_formulas

a named list of right-hand side formulas, one for each parameter of the distribution specified in family; set to ~ 1 if the parameter should be treated as constant. Use the s()-notation from mgcv to specify non-linear structured effects and d(...) for deep learning predictors (predictors in brackets are separated by commas), where d can be replaced by any of the names in list_of_deep_models, e.g., ~ 1 + s(x) + my_deep_mod(a,b,c), where my_deep_mod is the name of the neural net specified in list_of_deep_models and a,b,c are features modeled via this network.

list_of_deep_models

a named list of functions specifying a keras model. See the examples for more details.

family

a character specifying the distribution. For information on possible distributions and their parameters, see make_tfd_dist. Can also be a custom distribution.

data

data.frame or named list with input features

tf_seed

a seed for TensorFlow (only works with R version >= 2.2.0)

return_prepoc

logical; if TRUE only the pre-processed data and layers are returned (default FALSE).

subnetwork_builder

function to build each subnetwork (network for each distribution parameter; by default subnetwork_init). Can also be a list of the same size as list_of_formulas.

model_builder

function to build the model based on additive predictors (by default keras_dr). To work with the methods defined for the class deepregression, the model should behave like a keras model.

fitting_function

function to fit the instantiated model when calling fit. By default the keras fit function.

additional_processors

a named list with additional processors to convert the formula(s). Can have an attribute "controls" to pass additional controls.

penalty_options

options for smoothing and penalty terms defined by penalty_control

orthog_options

options for the orthogonalization defined by orthog_control

weight_options

options for layer weights defined by weight_control

formula_options

options for formula parsing (mainly used to make calculations more efficient)

output_dim

dimension of the output; by default 1L

verbose

logical; whether to print progress of model initialization to console

...

further arguments passed to the model_builder function
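
As a rough illustration of how list_of_formulas and list_of_deep_models relate, the following sketch uses base R only and does not call deepregression or TensorFlow. It builds one right-hand side formula per distribution parameter, inspects the terms, and checks which deep-model names occur in a formula. The names loc, scale and my_deep_mod follow the text above; the function body is a placeholder (an assumption for illustration), since a real entry would build a keras model. The last line evaluates the default tf_seed expression shown under Usage.

```r
# Sketch using base R only; no deepregression or TensorFlow calls are made.

# One right-hand side formula per distribution parameter.
# `my_deep_mod` is the illustrative network name from the argument text.
list_of_formulas <- list(
  loc   = ~ 1 + s(x) + my_deep_mod(a, b, c),
  scale = ~ 1  # parameter treated as constant
)

# Placeholder standing in for a function that would build a keras model.
list_of_deep_models <- list(my_deep_mod = function(x) x)

# Term labels of each formula, as parsed by stats::terms().
term_labels <- lapply(list_of_formulas,
                      function(f) attr(terms(f), "term.labels"))

# Deep-model names that actually appear in the `loc` formula.
used_models <- intersect(names(list_of_deep_models),
                         all.names(list_of_formulas$loc))

# Features referenced in the `loc` formula.
loc_features <- all.vars(list_of_formulas$loc)

# The default seed expression from Usage, evaluated.
default_seed <- as.integer(1991 - 5 - 4)
```

A check of this kind can be run before model construction to spot a deep-model name that is used in a formula but missing from list_of_deep_models; how deepregression itself reports such a mismatch is not covered here.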

References

Ruegamer, D. et al. (2023): deepregression: a Flexible Neural Network Framework for Semi-Structured Deep Distributional Regression. doi: 10.18637/jss.v105.i02.

Examples

library(deepregression)

n <- 1000
data <- data.frame(matrix(rnorm(4*n), nrow = n, ncol = 4))
colnames(data) <- c("x1","x2","x3","xa")
formula <- ~ 1 + deep_model(x1,x2,x3) + s(xa) + x1

deep_model <- function(x) x %>%
  layer_dense(units = 32, activation = "relu", use_bias = FALSE) %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 8, activation = "relu") %>%
  layer_dense(units = 1, activation = "linear")

y <- rnorm(n) + data$xa^2 + data$x1

mod <- deepregression(
  list_of_formulas = list(loc = formula, scale = ~ 1),
  data = data, y = y,
  list_of_deep_models = list(deep_model = deep_model)
)

if(!is.null(mod)){

  # train for more than 10 epochs to get a better model
  mod %>% fit(epochs = 10, early_stopping = TRUE)
  mod %>% fitted() %>% head()
  cvres <- mod %>% cv()
  mod %>% get_partial_effect(name = "s(xa)")
  mod %>% coef()
  mod %>% plot()

}

mod <- deepregression(
  list_of_formulas = list(loc = ~ 1 + s(xa) + x1, scale = ~ 1,
                          dummy = ~ -1 + deep_model(x1,x2,x3) %OZ% 1),
  data = data, y = y,
  list_of_deep_models = list(deep_model = deep_model),
  mapping = list(1,2,1:2)
)


deepregression documentation built on Jan. 18, 2023, 1:11 a.m.