optint: Optimal intervention
In optinterv: Optimal Intervention


Identifies the factors with the greatest potential to increase a pre-specified outcome, using various methods.

optint(Y, X, control = NULL, wgt = rep(1, length(Y)),
  method = "non-parametric", lambda = 100, sigma = 1,
  grp.size = 30, n.boot = 1000, sign.factor = 2/3, alpha = 0.05,
  n.quant = floor(length(Y)/10), perm.test = T, n.perm = 1000,
  quick = F, plot = T, seed = runif(1, 0, .Machine$integer.max))

`Y`	outcome vector (must be numeric without NA's).
`X`	numeric data frame or matrix of factors to be considered.
`control`	numeric data frame or matrix of factors to control for. these are factors that cannot themselves be intervened on when looking for the optimal intervention (e.g. race).
`wgt`	an optional vector of weights.
`method`	the method to be used. either "non-parametric" (default), "correlation" or "nearest-neighbors".
`lambda`	the Lagrange multiplier. also known as the shadow price of an intervention.
`sigma`	distance penalty for the nearest-neighbors method.
`grp.size`	for the nearest-neighbors method; if the number of examples in each control group is smaller than grp.size, performs weight adjustment using `wgt_adjust`. otherwise, calculates weights separately for each control group.
`n.boot`	number of bootstrap replications to use for the standard errors / confidence intervals calculation.
`sign.factor`	what proportion of quantiles should be increased (decreased) in order to return a positive (negative) sign? not relevant for the correlation method (there the correlation sign is returned).
`alpha`	significance level for the confidence intervals.
`n.quant`	number of quantiles to use when calculating CDF distance.
`perm.test`	logical. if TRUE (default) performs permutation test and calculates p-values.
`n.perm`	number of permutations for the permutation test.
`quick`	logical. if TRUE, returns only E(X | I=1) - E(X | I=0) as an estimate. this estimate is used by `optint_by_group`.
`plot`	logical. if TRUE (default), the results are plotted by `plot.optint`.
`seed`	the seed of the random number generator.
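
The quantity E(X | I=1) - E(X | I=0) mentioned under `quick` can be read as a difference of weighted means: the same sample, averaged once under the original weights and once under the intervention weights. The base-R sketch below illustrates only that arithmetic. The data and the `wgt1` weights are made up for the illustration; they are not the weights `optint` would compute, and this is not the package's implementation.

```r
# Hypothetical data: three factors observed on 200 units.
set.seed(1)
X <- matrix(rnorm(200 * 3), ncol = 3,
            dimnames = list(NULL, c("x1", "x2", "x3")))

# Original weights (uniform, mirroring the default `wgt`).
wgt <- rep(1, nrow(X))

# Stand-in intervention weights: up-weight units with a large x1.
# Illustrative assumption only, NOT the weights optint() would produce.
wgt1 <- wgt * exp(0.5 * X[, "x1"])

# Weighted column means under each set of weights.
mean_I0 <- apply(X, 2, weighted.mean, w = wgt)
mean_I1 <- apply(X, 2, weighted.mean, w = wgt1)

# Estimate of E(X | I=1) - E(X | I=0), one value per factor.
x_diff <- mean_I1 - mean_I0
print(x_diff)
```

Because the stand-in weights favor large values of `x1`, the difference for `x1` is positive, while the other two factors shift only by sampling noise.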

an object of class "optint". This object is a list containing the following components:

`estimates`	standardized point estimates (correlations for the correlation method and cdf distances otherwise).
`estimates_sd`	standard deviations of the estimates.
`details`	a list containing further details, such as:

Y_diff - E(Y | I=1) - E(Y | I=0).
Y_diff_sd - standard deviation for Y_diff.
method - the method used.
lambda - the Lagrange multiplier used.
signs - signs (i.e. directions) for the estimates.
p_value - p-values for the estimates.
ci - a matrix of confidence intervals for the estimates.
stand_factor - the standardization factor used to standardize the results.
kl_distance - the Kullback–Leibler divergence of P(X | I=0) from P(X | I=1).
new_sample - a data frame containing X, control (if provided), wgt (the original weights) and wgt1 (the new weights under I = 1).

In addition, the function summary can be used to print a summary of the results.
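
Since the returned object is a list, its components can also be read directly with `$`. The sketch below shows one way to rank factors by their estimates. To stay runnable without the package, it uses a hand-built stand-in that follows the structure described above; the class name `optint_mock` and every number in it are placeholders, not output from `optint`.

```r
# Hand-built stand-in mirroring the documented structure.
# All values are placeholders chosen for illustration.
res <- structure(
  list(
    estimates    = c(x1 = 0.40, x2 = 0.10, x3 = 0.25),
    estimates_sd = c(x1 = 0.05, x2 = 0.08, x3 = 0.06),
    details = list(
      method  = "non-parametric",
      lambda  = 10,
      p_value = c(x1 = 0.01, x2 = 0.30, x3 = 0.04)
    )
  ),
  class = "optint_mock"  # hypothetical class, so no package methods are dispatched
)

# Order the factors from largest to smallest estimate.
ord <- order(res$estimates, decreasing = TRUE)

# Collect estimates, standard deviations and p-values in one table.
ranked <- data.frame(
  estimate = res$estimates[ord],
  sd       = res$estimates_sd[ord],
  p_value  = res$details$p_value[ord]
)
print(ranked)
```

With a real result, the same `$` access pattern would apply to the object returned by `optint`, assuming the component names listed above.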

# generate data
n <- 50
p <- 3
features <- matrix(rnorm(n*p), ncol = p)
men <- matrix(rbinom(n, 1, 0.5), nrow = n)
outcome <- 2*(features[,1] > 1) + men*pmax(features[,2], 0) + rnorm(n)
outcome <- as.vector(outcome)

#find the optimal intervention using the non-parametric method:
imp_feat <- optint(Y = outcome, X = features, control = men,
                   method = "non-parametric", lambda = 10, plot = TRUE,
                   n.boot = 100, n.perm = 100)

#by default, only the significant features are displayed
#(see ?plot.optint for further details).
#for customized variable importance plot, use plot():
plot(imp_feat, plot.vars = 3)

#show summary of the results using summary():
summary(imp_feat)