dindexm: Distributional index model (DIM)
In isodistrreg: Isotonic Distributional Regression (IDR)


Fits a distributional index model with a user-specified index function to a training dataset. See the examples at the bottom to learn how to specify a distributional single index model.

dindexm(
  formula,
  indexfit,
  data,
  response,
  pars = osqpSettings(verbose = FALSE, eps_abs = 1e-05, eps_rel = 1e-05,
    max_iter = 10000L),
  progress = TRUE,
  ...
)

`formula`	object of class `formula` that describes the index model
`indexfit`	function that fits the index model to training data. It should accept the arguments `formula` and `data`, and the object it returns should admit a `predict` method. Further arguments in `...` are passed to `indexfit`. See examples.
`data`	`data.frame` containing the covariates of the index model and the response variable.
`response`	name of the response variable in `data`.
`pars`	parameters for quadratic programming optimization (only relevant for multivariate index functions), set using `osqpSettings`.
`progress`	display a progress bar while fitting IDR?
`...`	further arguments passed to `indexfit`.
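
As a rough illustration of what `indexfit` has to provide, the sketch below defines a hypothetical wrapper `fit_index` (not part of isodistrreg) around `glm` and checks that it accepts `formula` and `data` and that its result works with `predict`. The toy data and all names are made up for illustration, and how `dindexm` calls these functions internally is an assumption here.

```r
## Hypothetical index-fitting function (name and model choice are illustrative).
## It accepts `formula` and `data` and returns an object with a `predict`
## method, which is what the `indexfit` argument asks for.
fit_index <- function(formula, data, ...) {
  glm(formula, data = data, family = poisson(), ...)
}

set.seed(1)
toy <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
toy$y <- rpois(50, lambda = exp(0.5 * toy$x1))

fit <- fit_index(y ~ x1 + x2, data = toy)

## The index values are the predictions of the fitted model
## (for glm, on the scale of the linear predictor by default).
index_train <- predict(fit, newdata = toy)
index_new <- predict(fit, newdata = data.frame(x1 = 0, x2 = 1))
```

A function of this shape could then be supplied as `indexfit = fit_index`. Whether the index is on the link or the response scale should not matter for the second step, because a monotone regression depends only on the ordering of the index values.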

This function fits a distributional index model (DIM) to training data. The DIM assumes that the response is more likely to attain higher values when the value of the index function increases. The index function can be estimated by parametric methods such as lm or glm, or nonparametrically.

The formal mathematical assumption of the DIM is that, at each fixed threshold z, the conditional CDF F_{y | g(X) = g(x)}(z) decreases as g(x) increases. Here y denotes the response, x and X are the covariates in data, and g is the index function estimated by indexfit.
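
Written out in the notation of this section, the assumption can be restated as follows, with the inequality understood in the weak sense:

```latex
\[
  g(x) \le g(x') \;\Longrightarrow\;
  F_{y \mid g(X) = g(x)}(z) \;\ge\; F_{y \mid g(X) = g(x')}(z)
  \quad \text{for all thresholds } z .
\]
```

In other words, the conditional distribution of y is stochastically increasing in the index g(x).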

Estimation is performed in two steps. First, indexfit is applied to data to estimate the function g. Second, idr is applied with the estimated index values g(x) as pseudo-covariates and y as the response.
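
The two estimation steps can be sketched with base R alone. This is an illustration under stated assumptions, not the package's implementation: `lm` plays the role of `indexfit`, and `isoreg` stands in for `idr`, estimating the conditional CDF at a single, arbitrarily chosen threshold `z` by a decreasing regression of the indicator `y <= z` on the estimated index. The simulated data and all variable names are hypothetical.

```r
set.seed(42)
n <- 500
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rnorm(n, mean = dat$x1 + 0.5 * dat$x2)

## Step 1: estimate the index function g (here with lm).
index_model <- lm(y ~ x1 + x2, data = dat)
g_hat <- predict(index_model, newdata = dat)

## Step 2: monotone regression on the pseudo-covariate g_hat.
## For a fixed threshold z, regress 1{y <= z} on g_hat under the constraint
## that the fit decreases in g_hat. isoreg() fits an increasing function,
## so the indicator is negated before fitting and the fit is negated again.
z <- 0
indicator <- as.numeric(dat$y <= z)
iso <- isoreg(g_hat, -indicator)

## iso$yf holds the fitted values ordered by increasing g_hat.
cdf_at_z <- -iso$yf
index_sorted <- sort(g_hat)
```

Here `cdf_at_z[i]` is the estimated probability that the response is at most `z` when the index equals `index_sorted[i]`. Conceptually, IDR performs this kind of monotone regression over all thresholds at once, which yields a full predictive CDF for each index value.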

An object of class dindexm: a list containing the index model (first component) and the IDR fit on the pseudo-data with the index as covariate (second component).

Henzi, A., Kleger, G. R., & Ziegel, J. F. (2020). Distributional (Single) Index Models. arXiv preprint arXiv:2006.09219.

idr for more information on IDR; predict.dindexfit for (out-of-sample) predictions based on a model fitted with dindexm.

library(isodistrreg)

## simulate training data
n <- 1000
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- rnorm(n, 1 - X[, 1] + X[, 2]^2 / 3 - (1 - X[, 3]) * (1 + X[, 3]) / 2)
data <- cbind(y = y, as.data.frame(X))

## data for out-of-sample prediction
newX <- data.frame(x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))

## linear regression model for index
model <- dindexm(
  formula = y ~ poly(x1, degree = 2) + poly(x2, degree = 2) + 
    poly(x3, degree = 2),
  indexfit = lm,
  response = "y",
  data = data
)
pred <- predict(model, data = newX)

## plot the predictive CDF for the first new observation, with the true CDF overlaid in red
plot(pred, 1, main = "LM based DIM")
grd <- pred[[1]]$points
trueCdf <- pnorm(
  grd,
  1 - newX[1, 1] + newX[1, 2]^2 / 3 - (1 - newX[1, 3]) * (1 + newX[1, 3]) / 2
)
points(grd, trueCdf, type = "l", col = 2)