sparsedistreg: Fitting Sparse Distributional Regression Models

View source: R/main.R


Fitting Sparse Distributional Regression Models

Description

Fits sparse distributional regression models in which each parameter of the response distribution is modelled by a penalized additive predictor. The amount of sparsity per predictor is controlled via lambdas, the amount of non-linearity via gammas.

Usage

sparsedistreg(
  y,
  list_of_formulas = NULL,
  family = "normal",
  data,
  type = "reduced",
  gammas = rep(0, length(list_of_formulas)),
  lambdas = rep(0, length(list_of_formulas)),
  sterm_options = sterm_control(),
  penalty_options = penalty_control(),
  ...
)

Arguments

y

response variable

list_of_formulas

a named list of right-hand side formulas, one for each parameter of the distribution specified in family; set a formula to ~ 1 if the corresponding parameter should be treated as constant. Only the covariate names need to be supplied, e.g., ~ 1 + x + z + a + b + c. If the argument is NULL, all covariates in data are used.
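
For a two-parameter family such as "normal", the list could look as follows (a sketch; the parameter names are illustrative placeholders and depend on the family, see make_tfd_dist):

# one right-hand side formula per distribution parameter
# (parameter names here are hypothetical placeholders)
list_of_formulas <- list(
  location = ~ 1 + x + z,  # additive predictor for the first parameter
  scale    = ~ 1           # second parameter treated as constant
)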

family

a character string specifying the response distribution. For information on the available distributions and their parameters, see make_tfd_dist. A custom distribution can also be supplied.

data

data.frame or named list with input features

type

character; if "reduced" (the default), P-splines with a basis of order 3 and a penalty of order 2 are used, which have a linear null space. "standard" uses P-splines with a basis of order 2 and a zero-order penalty.

gammas

a vector of length length(list_of_formulas) with entries in [0, 1]. Each value specifies the amount of non-linearity allowed in the corresponding additive predictor (0 means maximum penalization of non-linear terms, 1 means maximum flexibility). Only used for type = "standard".

lambdas

a vector of length length(list_of_formulas) specifying the amount of sparsity in each additive predictor (0 means no sparsity penalty).
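
Both gammas and lambdas line up with list_of_formulas, one entry per additive predictor. A sketch for a two-predictor model, using the values from the Examples below:

gammas  <- c(0.2, 0)  # first predictor: 20/80 penalty split linear/non-linear
lambdas <- c(2, 0)    # first predictor: sparsity penalty of 2; second: none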

sterm_options

options for smooth terms defined in sterm_control

penalty_options

options for penalization, see ?deepregression

...

further arguments passed to deepregression
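
For example (a sketch; tf_seed is assumed to be an argument of the installed deepregression version and is forwarded through ... for reproducibility):

# hypothetical: forward a deepregression argument through ...
mod <- sparsedistreg(y = y, data = data, tf_seed = 42L)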

Examples

library(sparsedistreg)

set.seed(32)

data <- data.frame(a=rnorm(100), b=rnorm(100), c=rnorm(100))
y <- rnorm(100) + data$a

# fit a model with lambdas = gammas = 0
mod <- sparsedistreg(
  y = y,
  data = data,
  type = "standard"
)

mod %>% fit(epochs=1500L, early_stopping=TRUE)

mod %>% plot()
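
Predictions for new data can be obtained via the methods inherited from deepregression (a sketch; deepregression's predict method is assumed here, which by default returns the mean of the fitted distribution):

# hypothetical prediction step for unseen data
newdat <- data.frame(a = rnorm(5), b = rnorm(5), c = rnorm(5))
mod %>% predict(newdata = newdat)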

# fit a model with a stronger sparsity penalty (lambda = 2) for the
# location parameter and a 20/80 penalty split between the linear
# and non-linear parts (gamma = 0.2)
mod <- sparsedistreg(
  y = y,
  data = data,
  type = "standard",
  gammas = c(0.2,0),
  lambdas = c(2,0)
)

mod %>% fit(epochs=2000L, early_stopping=TRUE)

mod %>% plot()
mod %>% coef()

# now set the penalty for linear effects in the location
# to 100% (gamma = 1) and increase the sparsity penalty
mod <- sparsedistreg(
  y = y,
  data = data,
  type = "standard",
  gammas = c(1,0),
  lambdas = c(200,0)
)

mod %>% fit(epochs=2000L, early_stopping=TRUE)

mod %>% plot()
mod %>% coef()

# penalize linear effect 100%
mod <- sparsedistreg(
  y = y,
  list_of_formulas = list(
    ~ a,
    ~ b + c
  ),
  data = data,
  type = "standard",
  gammas = c(1,0),
  lambdas = c(2,0)
)

mod %>% fit(epochs=2000L, early_stopping=TRUE)

# plots only the mean, so s(a)
mod %>% plot()
# linear effect completely absorbed into spline
mod %>% coef()
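
To check this numerically, inspect the returned coefficient list directly (a sketch; str() from base R just prints the structure of the list returned by coef()):

# the linear coefficient of a in the location predictor should be
# (near) zero, its effect being absorbed by the spline
str(mod %>% coef())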

##### reduced form now
# first penalize s(a) towards a null effect
mod <- sparsedistreg(
  y = y,
  list_of_formulas = list(
    ~ a,
    ~ b + c
  ),
  data = data,
  type = "reduced",
  lambdas = c(1,0)
)
mod %>% fit(epochs=2000L, early_stopping=TRUE)

mod %>% plot()

# now without the lambda penalty

mod <- sparsedistreg(
  y = y,
  list_of_formulas = list(
    ~ a,
    ~ b + c
  ),
  data = data,
  type = "reduced",
  lambdas = c(0,0)
)
mod %>% fit(epochs=2000L, early_stopping=TRUE)

mod %>% plot()

# now an additional term (b) with no influence on the response
mod <- sparsedistreg(
  y = y,
  list_of_formulas = list(
    ~ a + b,
    ~ c
  ),
  data = data,
  type = "reduced",
  lambdas = c(4,0)
)
mod %>% fit(epochs=2000L, early_stopping=TRUE)

mod %>% plot()

# the smooth-term basis (e.g., P-splines instead of thin-plate
# regression splines) can be configured via sterm_options; see
# sterm_control for the available options

mod <- sparsedistreg(
  y = y,
  list_of_formulas = list(
    ~ a + b,
    ~ c
  ),
  data = data,
  type = "reduced",
  lambdas = c(4,0)
)
mod %>% fit(epochs=2000L, early_stopping=TRUE)

mod %>% plot()

