estimate_aps: Main APS estimation function.

View source: R/aps.R

Description

Main APS estimation function.

Usage

estimate_aps(
  data,
  ml,
  ml_type = "stats",
  xc = NA,
  xd = NA,
  infer = FALSE,
  s = 100,
  delta = 0.8,
  L = NA,
  seed = NA,
  fcn = NA,
  parallel = F,
  cores = NA,
  ...
)

Arguments

data

Dataset containing ML input variables.

ml

ML function for treatment recommendation

ml_type

String indicating the ML object source package name, or "custom" for a user-defined algorithm that runs prediction upon function call with a single input data.table object. See details for supported packages. Defaults to "stats".

xc

Character vector of column names of the continuous variables.

xd

Character vector of column names of the discrete variables.

infer

Boolean indicating whether to infer continuous/discrete variables from the remaining columns of data. Defaults to FALSE.

s

Number of draws for each APS estimation. Defaults to 100.

delta

Radius of the sampling ball. Can be either a single numeric value or a numeric vector of radii. Defaults to 0.8.

L

Named list where the names correspond to the names of the mixed variables in the data, and the values are numeric vectors indicating the set of discrete values for the variable.

seed

Random seed

fcn

Function to apply to output of ML function

parallel

Boolean indicator for whether to parallelize the APS estimation. Defaults to FALSE.

cores

Integer number of cores for parallelization. If NA, then detectCores() is called.

...

Additional inputs to be passed to fcn

Details

ML packages currently supported: "mlr3", "caret", "stats" (base), "randomForest", "e1071", "bestridge", "rpart", "tree", "custom" (for user-defined functions)

Approximate propensity score (APS) estimation takes draws X_c^1, …, X_c^s from the uniform distribution on N(X_{ci}, δ), where N(X_{ci}, δ) is the p_c-dimensional ball centered at X_{ci} with radius δ. The draws X_c^1, …, X_c^s are destandardized before being passed to the ML function for inference. The estimating equation is p^s(X_i; δ) = \frac{1}{s} ∑_{j=1}^{s} ML(X_c^j, X_{di}).
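The averaging step above can be illustrated with a minimal base-R sketch for a single observation. This is not the package's implementation: the helpers draw_ball and aps_one are hypothetical names, and the standardization by column mean and standard deviation is an assumption about what "destandardized" refers to.

```r
# Hypothetical helper: draw n points uniformly from the p-dimensional ball
# of radius delta centered at `center`.
draw_ball <- function(center, delta, n) {
  p <- length(center)
  z <- matrix(rnorm(n * p), nrow = n, ncol = p)  # random Gaussian directions
  z <- z / sqrt(rowSums(z^2))                    # normalize rows onto the unit sphere
  r <- delta * runif(n)^(1 / p)                  # radii that give uniform density in the ball
  sweep(z * r, 2, center, "+")                   # shift to the center
}

# Hypothetical helper: APS for one observation.
# x_c: named numeric vector of continuous inputs; x_d: one-row data frame of discrete inputs.
# mu, sigma: assumed standardization constants for the continuous inputs.
aps_one <- function(x_c, x_d, predict_fn, mu, sigma, s = 100, delta = 0.8) {
  x_std <- (x_c - mu) / sigma                    # standardize the center
  draws_std <- draw_ball(x_std, delta, s)        # s draws in standardized space
  draws <- sweep(sweep(draws_std, 2, sigma, "*"), 2, mu, "+")  # destandardize
  newdata <- as.data.frame(draws)
  names(newdata) <- names(x_c)
  for (nm in names(x_d)) newdata[[nm]] <- x_d[[nm]]  # hold discrete inputs fixed
  mean(predict_fn(newdata))                      # average ML output over the draws
}

set.seed(1)
model <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = iris)
xc <- c("Sepal.Width", "Petal.Length")
mu <- colMeans(iris[, xc])
sigma <- apply(iris[, xc], 2, sd)
x_c <- unlist(iris[1, xc])
x_d <- iris[1, "Petal.Width", drop = FALSE]
aps_hat <- aps_one(x_c, x_d, function(d) predict(model, newdata = d),
                   mu, sigma, s = 50, delta = 0.1)
```

Because the model here is linear and the ball is symmetric around the observation, aps_hat should land close to the model's prediction at that observation; with a thresholded decision rule the average instead approximates the share of draws assigned to treatment.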

If neither xc nor xd is passed and infer=TRUE, then all columns of data are assumed to be relevant continuous inputs. If only one of them is passed, the remaining variables in data are assumed to be inputs of the other type. When not all variables in data are relevant ML inputs, it is recommended to pass both xc and xd rather than rely on inference.
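The inference rules above can be sketched as a small base-R function. The name resolve_inputs is hypothetical and this is only an illustration of the described behavior, not the package's code; it assumes that the "only one is passed" rule applies when infer=TRUE.

```r
# Hypothetical illustration of how xc/xd could be resolved from the column names of data.
resolve_inputs <- function(data, xc = NA, xd = NA, infer = FALSE) {
  has_xc <- !all(is.na(xc))
  has_xd <- !all(is.na(xd))
  if (infer) {
    if (!has_xc && !has_xd) {
      xc <- names(data)                 # everything is treated as continuous
      xd <- character(0)
    } else if (has_xc && !has_xd) {
      xd <- setdiff(names(data), xc)    # remaining columns are treated as discrete
    } else if (!has_xc && has_xd) {
      xc <- setdiff(names(data), xd)    # remaining columns are treated as continuous
    }
  } else {
    if (!has_xc) xc <- character(0)     # no inference: use only what was passed
    if (!has_xd) xd <- character(0)
  }
  list(xc = xc, xd = xd)
}

inputs <- resolve_inputs(iris[, 1:4], xc = "Sepal.Width", infer = TRUE)
```

In this example every column other than Sepal.Width ends up in inputs$xd, which shows why passing both xc and xd explicitly is safer when data contains columns the ML function does not use.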

Value

If a single delta value is passed, the function returns a vector of estimated Approximate Propensity Scores of the same length as the input data. If multiple delta values are passed, a list of estimated APS vectors is returned, keyed by delta value.

Examples

data("iris")
# Iris examples
model <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data=iris)
estimate_aps(iris, model, xc = names(iris)[2:3], xd = names(iris)[4],
             infer=FALSE, s=50, delta=0.1)

# Estimate APS while applying decision function assign_cutoff to model output with input cutoff=0.5
assign_cutoff <- function(X, cutoff){
  ret <- as.integer(X > cutoff)
  return(ret)
}
estimate_aps(iris, model, xc = names(iris)[2:3], xd = names(iris)[4],
             infer=FALSE, s=50, delta=0.1, fcn=assign_cutoff, cutoff=0.5)

# Multiple deltas
estimate_aps(iris, model, xc = names(iris)[2:3], xd = names(iris)[4],
             infer=FALSE, s=50, delta=c(0.1,0.5,1))

# Define mixed continuous/discrete variables
estimate_aps(iris, model, xc = names(iris)[2:3], xd = names(iris)[4],
             infer=FALSE, s=50, delta=0.1,
             L = list("Sepal.Width" = c(2, 3), "Petal.Length" = c(3, 4)))

factoryofthesun/r-IVaps documentation built on Dec. 20, 2021, 7:41 a.m.