optint_by_group: Optimal intervention, by group
In optinterv: Optimal Intervention


Similar to `optint`, but identifies the factors with the greatest potential to increase a pre-specified outcome for each group separately, which makes it possible to detect heterogeneity between groups.

optint_by_group(Y, X, group, control = NULL, wgt = rep(1, length(Y)),
  method = "non-parametric", lambda = 100, sigma = 1,
  grp.size = 30, n.boot = 1000, alpha = 0.05, plot = TRUE)

`Y`	outcome vector (must be numeric without NA's).
`X`	numeric data frame or matrix of factors to be considered.
`group`	vector of group labels (i.e. the grouping variable). `optint` is applied to each group separately.
`control`	numeric data frame or matrix of factors to control for. these are factors that cannot be changed by an intervention, so they are not considered when looking for the optimal intervention (e.g. race).
`wgt`	an optional vector of weights.
`method`	the method to be used: "non-parametric" (default), "correlation" or "nearest-neighbors".
`lambda`	the Lagrange multiplier, also known as the shadow price of an intervention.
`sigma`	distance penalty for the nearest-neighbors method.
`grp.size`	for the nearest-neighbors method; if the number of examples in each control group is smaller than `grp.size`, weights are adjusted using `wgt_adjust`. otherwise, weights are calculated separately for each control group.
`n.boot`	number of bootstrap replications to use for the standard errors / confidence intervals calculation.
`alpha`	significance level for the confidence intervals.
`plot`	logical. if TRUE (default), the results are plotted by `plot.optint_by_group`.
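
The `group`, `n.boot` and `alpha` arguments describe a general pattern: compute an estimate within each group, then bootstrap it to get standard errors and confidence intervals. The sketch below illustrates that pattern in base R only. It does not call `optinterv`, and the per-group statistic (a plain correlation), the percentile intervals and all variable names are assumptions chosen for illustration, not the package's internal procedure.

```r
# Illustrative sketch in base R; not the optinterv implementation.
set.seed(1)
n <- 200
x <- rnorm(n)
group <- rbinom(n, 1, 0.5)
# x affects y only in group 1, so the groups are heterogeneous
y <- group * x + rnorm(n)

# hypothetical per-group statistic: correlation between x and y
stat <- function(idx) cor(x[idx], y[idx])

n_boot <- 200  # plays the role of n.boot
alpha <- 0.05  # plays the role of alpha

# split the row indices by group and bootstrap within each group
by_group <- lapply(split(seq_len(n), group), function(idx) {
  boot <- replicate(n_boot, stat(sample(idx, replace = TRUE)))
  c(est = stat(idx),
    sd = sd(boot),
    lower = unname(quantile(boot, alpha / 2)),
    upper = unname(quantile(boot, 1 - alpha / 2)))
})
res <- do.call(rbind, by_group)
print(round(res, 3))
```

Each row of `res` corresponds to one group label. A clear gap between the rows is the kind of between-group heterogeneity that a per-group analysis is meant to reveal.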

An object of class "optint_by_group". This object is a list containing two components:

`est`	a matrix of estimates (in their original units), for each group. here the estimates are E(X | I=1) - E(X | I=0), and they are used by `plot.optint_by_group`.
`sd`	standard deviations of the estimates.
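
To make the quantity E(X | I=1) - E(X | I=0) concrete, the following base R sketch computes a difference of weighted means within each group. It assumes that I=1 refers to the distribution of X under the intervention and I=0 to the observed distribution, and it represents the two with weight vectors. The weights `w1` and `w0` and the helper `mean_diff` are made up for illustration; the package derives its own weights according to `method`.

```r
# Illustrative sketch in base R; the weights are hypothetical.
set.seed(2)
n <- 100
X <- cbind(x1 = rnorm(n), x2 = rnorm(n))
group <- rep(c("a", "b"), each = n / 2)

w0 <- rep(1, n)             # stands in for the observed weights (I = 0)
w1 <- exp(0.5 * X[, "x1"])  # made-up intervention weights (I = 1)

# difference of weighted means for every column of X, on rows idx
mean_diff <- function(idx) {
  apply(X[idx, , drop = FALSE], 2, function(col) {
    weighted.mean(col, w1[idx]) - weighted.mean(col, w0[idx])
  })
}

# one column of estimates per group, one row per factor
est <- sapply(split(seq_len(n), group), mean_diff)
print(round(est, 3))
```

Because `w1` puts more weight on large values of `x1`, the `x1` row is positive in both groups, while the `x2` row only reflects sampling noise. The orientation of `est` here (factors in rows, groups in columns) is a choice made for this sketch and may not match the object returned by `optint_by_group`.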

library(optinterv)

# generate data
n <- 50
p <- 3
features <- matrix(rnorm(n*p), ncol = p)
men <- matrix(rbinom(n, 1, 0.5), nrow = n)
outcome <- 2*(features[,1] > 1) + men*pmax(features[,2], 0) + rnorm(n)
outcome <- as.vector(outcome)

# find the optimal intervention using the non-parametric method:
imp_feat <- optint(Y = outcome, X = features, control = men,
                   method = "non-parametric", lambda = 10, plot = TRUE,
                   n.boot = 100, n.perm = 100)

# explore how the optimal intervention varies between genders using optint_by_group():
men <- as.vector(men)
imp_feat_by_gender <- optint_by_group(Y = outcome, X = features,
                                      group = men,
                                      method = "non-parametric",
                                      lambda = 10)

# by default, only the significant features are displayed
# (see ?plot.optint_by_group for further details).
# for a customized variable importance plot, use plot():
plot(imp_feat_by_gender, plot.vars = 3)