create_list_from_scratch: Create a sparse list representation of treatment-to-control...

create_list_from_scratch    R Documentation

Create a sparse list representation of the treatment-to-control distance matrix with a caliper.

Description

This function takes an n-by-p matrix of observed covariates, a length-n vector of treatment indicators, and a caliper, and constructs a possibly sparse list representation of the treatment-to-control distance matrix.

Usage

create_list_from_scratch(
  Z,
  X,
  exact = NULL,
  soft_exact = FALSE,
  p = NULL,
  caliper_low = NULL,
  caliper_high = NULL,
  k = NULL,
  alpha = 1,
  penalty = Inf,
  method = "maha",
  dist_func = NULL
)

Arguments

Z

A length-n vector of treatment indicators.

X

An n-by-p matrix of covariates.

exact

A vector of strings indicating which variables need to be exactly matched.

soft_exact

If set to TRUE, the exact constraint is enforced up to a large penalty.

p

A length-n vector to which the caliper applies, e.g. a vector of propensity scores.

caliper_low

Size of the lower caliper on p.

caliper_high

Size of the upper caliper on p.

k

Connect each treated unit to its k nearest controls. See the Details section.

alpha

Tuning parameter.

penalty

Penalty for violating the caliper. Set to Inf by default.

method

Method used to compute the treated-control distance. See the Details section for the available options.

dist_func

A user-specified function that computes the treated-control distance. See the Details section.

Details

Currently, four methods are implemented in this function: 'maha' (Mahalanobis distance), 'robust maha' (robust Mahalanobis distance), '0/1' (distance = 0 if and only if covariates are the same), and 'Hamming' (Hamming distance).
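To make these distances concrete, here is a minimal base-R sketch that computes three of them between one treated unit and a few controls. The toy data and the choice to pool treated and controls when estimating the covariance matrix are assumptions made for illustration; the package may scale or compute its distances differently, and the robust Mahalanobis variant is not shown.

```r
# Toy data (illustrative values): 3 controls and 1 treated unit, 2 covariates.
controls <- rbind(c(2, 0), c(2, 1), c(4, 1))
treated  <- c(2, 0)

# Covariance estimated from all four units (an assumption of this sketch).
S <- cov(rbind(controls, treated))

# Mahalanobis distance; stats::mahalanobis() returns the SQUARED distance.
d_maha <- sqrt(mahalanobis(controls, center = treated, cov = S))

# Hamming distance: number of covariates on which the two units differ.
d_hamming <- rowSums(sweep(controls, 2, treated, FUN = "!="))

# 0/1 distance: 0 if and only if all covariates agree, 1 otherwise.
d_01 <- as.numeric(d_hamming > 0)
```

The first control has the same covariates as the treated unit, so all three distances are zero for it.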

Users can also supply their own distance function by setting method = 'other' and using the argument “dist_func”. “dist_func” is a user-supplied distance function of the form dist_func(controls, treated), where treated is a length-p vector of covariates and controls is an n_c-by-p matrix of covariates. The output of dist_func is a length-n_c vector of distances between each control and the treated unit.
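As an illustration of the required format, the following sketch defines a hypothetical user-supplied distance, dist_l1, that returns the L1 (Manhattan) distance. The function name and the toy data are not part of the package; only the (controls, treated) signature and the length-n_c return value come from the description above.

```r
# Hypothetical user-supplied distance: L1 (Manhattan) distance.
# controls: n_c-by-p matrix of covariates; treated: length-p vector.
# Returns a length-n_c vector, one distance per control.
dist_l1 <- function(controls, treated) {
  # Guard against a single control being passed as a plain vector.
  controls <- matrix(controls, ncol = length(treated))
  rowSums(abs(sweep(controls, 2, treated)))
}

# Toy check with 3 controls and 2 covariates (illustrative values).
controls <- rbind(c(1, 0), c(2, 1), c(4, 1))
treated  <- c(2, 0)
d <- dist_l1(controls, treated)
```

Such a function would then be passed as dist_func = dist_l1 together with method = 'other'.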

There are two ways to make the network sparse. First, a caliper is applied to the vector p, so that a treated unit is not connected to controls whose covariate or propensity score, as given by p, falls outside p +/- caliper. Second, even within a specified caliper there are sometimes still too many controls connected to each treated unit; this number can be trimmed further to at most k by restricting attention to the k controls nearest (in p) to each treated unit.
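The two sparsification steps can be sketched in base R for a single treated unit. This is an illustration of the idea under a symmetric caliper, with made-up scores; it is not the package's internal code, and all variable names are hypothetical.

```r
# Toy propensity scores (illustrative values only).
p_treated <- 0.50
p_control <- c(0.30, 0.46, 0.49, 0.52, 0.54, 0.70)
caliper   <- 0.05
k         <- 2

# Step 1: keep controls whose p lies within p_treated +/- caliper.
in_caliper <- which(abs(p_control - p_treated) <= caliper)

# Step 2: among those, keep at most the k nearest in p.
gap  <- abs(p_control[in_caliper] - p_treated)
kept <- in_caliper[order(gap)][seq_len(min(k, length(in_caliper)))]
```

Here the caliper leaves controls 2 through 5 connected to the treated unit, and k = 2 then retains only controls 3 and 4.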

By default a hard caliper is applied, i.e., option penalty is set to Inf by default. Users may make the caliper a soft one by setting penalty to a large yet finite number.

Value

This function returns a list of three objects: start_n, end_n, and d. See documentation of function “create_list_from_mat” for more details.
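For readers unfamiliar with this kind of output, the sketch below shows how three parallel vectors can encode a sparse treated-by-control distance matrix as an edge list. The node numbering (treated units first, then controls) and the use of Inf to mark removed edges are assumptions of this sketch; consult the documentation of “create_list_from_mat” for the package's actual conventions.

```r
# Toy 2-by-3 treated-by-control distance matrix; Inf marks a removed edge.
dist_mat <- rbind(c(1.2, Inf, 0.7),
                  c(Inf, 0.4, Inf))
n_t <- nrow(dist_mat)

# Row/column positions of the edges that survive, ordered by treated unit.
keep <- which(is.finite(dist_mat), arr.ind = TRUE)
keep <- keep[order(keep[, "row"], keep[, "col"]), , drop = FALSE]

edge_list <- list(
  start_n = unname(keep[, "row"]),        # treated node of each edge
  end_n   = unname(n_t + keep[, "col"]),  # control node (assumed numbering)
  d       = dist_mat[keep]                # distance attached to each edge
)
```

Only the three finite entries are stored, so the list grows with the number of retained edges rather than with the full size of the matrix.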

Examples

## Not run: 
# We first prepare the inputs: X, Z, and the propensity score

attach(dt_Rouse)
X = cbind(female,black,bytest,dadeduc,momeduc,fincome)
Z = IV
propensity = glm(IV~female+black+bytest+dadeduc+momeduc+fincome,
                family=binomial)$fitted.values
detach(dt_Rouse)

# Create distance lists with built-in options.

# Mahalanobis distance with propensity score caliper = 0.05
# and k = 100.

dist_list_pscore_maha = create_list_from_scratch(Z, X, p = propensity,
                               caliper_low = 0.05, k = 100, method = 'maha')


# More examples, including how to use a user-supplied
# distance function, can be found in the vignette.

## End(Not run)

match2C documentation built on March 31, 2023, 6:39 p.m.