getdist: L1 Distance

View source: R/functions.R

getdist    R Documentation

L1 Distance

Description

Calculates the L1 distance between the treated or population units and the kernel-balanced control or sampled units.

Usage

getdist(
  target,
  observed,
  K,
  w.pop = NULL,
  w = NULL,
  numdims = NULL,
  ebal.tol = 1e-06,
  ebal.maxit = 500,
  svd.U = NULL
)

Arguments

target

a numeric vector of length equal to the total number of units where population/treated units take a value of 1 and sample/control units take a value of 0.

observed

a numeric vector of length equal to the total number of units where sampled/control units take a value of 1 and population/treated units take a value of 0.

K

the kernel matrix

w.pop

an optional vector input to specify population weights. Must be of length equal to the total number of units (rows in svd.U) with all sampled units receiving a weight of 1. The sum of the weights for population units must be either 1 or the number of population units.

w

an optional numeric vector of weights for every observation. These weights should sum to the total number of units: treated or population units have a weight of 1, and control or sampled units have weights derived from kernel balancing with mean 1, consistent with the output of getw(). If unspecified, these weights are found internally with ebalance_custom(), using numdims dimensions of svd.U, the left singular vectors of the kernel matrix.

numdims

an optional numeric input specifying the number of columns of the singular value decomposition of the kernel matrix to use when finding weights when w is not specified.

ebal.tol

an optional numeric input specifying the tolerance level used by the custom entropy balancing function ebalance_custom() when w is not specified.

ebal.maxit

an optional numeric input specifying the maximum number of iterations of the optimization search used by ebalance_custom() when w is not specified.

svd.U

an optional matrix of left singular vectors from performing svd() on the kernel matrix, used when w is unspecified. If neither w nor svd.U is specified, the SVD of K is computed internally.
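
As an illustration of how the arguments above fit together, the following is a minimal base-R sketch on toy data. The data, the hand-built kernel, its bandwidth, and the helper name U_sub are all hypothetical and are not taken from the package; the sketch only shows inputs that satisfy the shape and sum constraints described in this section.

```r
# Toy data: 3 population/treated units followed by 4 sampled/control units.
target   <- c(1, 1, 1, 0, 0, 0, 0)
observed <- 1 - target

# Hypothetical population weights: sampled units receive 1, and the
# population-unit weights sum to the number of population units (3).
w.pop <- c(1.5, 0.5, 1.0, 1, 1, 1, 1)

# Hand-built Gaussian-style kernel as a stand-in for K
# (the bandwidth of 2 is an arbitrary assumption).
set.seed(1)
X <- matrix(rnorm(7 * 2), nrow = 7)
K <- exp(-as.matrix(dist(X))^2 / 2)

# Left singular vectors of K, and a subset of numdims columns of them.
svd.U   <- svd(K)$u
numdims <- 3
U_sub   <- svd.U[, 1:numdims, drop = FALSE]

# Checks mirroring the constraints stated in the argument descriptions.
stopifnot(length(target) == nrow(svd.U))
stopifnot(all(w.pop[observed == 1] == 1))
stopifnot(isTRUE(all.equal(sum(w.pop[target == 1]), sum(target))))
```

The checks at the end are only a convenience for catching mismatched lengths or weights before calling getdist(); they are not part of the package.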

Value

L1

a numeric giving the L1 distance, the absolute difference between pX_D1 and pX_D0w

w

numeric vector of weights used

pX_D1

a numeric vector of length equal to the total number of observations where the nth entry is the sum of the kernel distances from the nth unit to every treated or population unit. If population units are specified, this sum is weighted by w.pop accordingly.

pX_D0

a numeric vector of length equal to the total number of observations where the nth entry is the sum of the kernel distances from the nth unit to every control or sampled unit.

pX_D0w

a numeric vector of length equal to the total number of observations where the nth entry is the weighted sum of the kernel distances from the nth unit to every control or sampled unit. The weights are given by entropy balancing and produce mean balance on \phi(X), the expanded features of X using a given kernel \phi(.), for the control or sample group and treated group or target population.
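
To make the returned quantities concrete, here is a rough base-R sketch that follows the descriptions above on a toy kernel matrix. It is not the package's implementation: the toy data, the example weights, the averaging over group size, and the absence of any scaling constant in the final sum are all assumptions made for illustration.

```r
# Toy kernel matrix for 5 units: 2 treated, 3 control (hypothetical data).
set.seed(2)
X <- matrix(rnorm(5 * 2), nrow = 5)
K <- exp(-as.matrix(dist(X))^2 / 2)
target   <- c(1, 1, 0, 0, 0)
observed <- 1 - target

# Hypothetical weights: treated units get 1; control weights have mean 1.
w <- c(1, 1, 0.6, 1.4, 1.0)

n_t <- sum(target)
n_c <- sum(observed)
K_t <- K[, target == 1, drop = FALSE]    # kernel values to treated units
K_c <- K[, observed == 1, drop = FALSE]  # kernel values to control units

# Kernel sums from every unit to each group, averaged over group size
# (the averaging is an assumption so the three vectors are comparable).
pX_D1  <- as.vector(K_t %*% rep(1, n_t)) / n_t
pX_D0  <- as.vector(K_c %*% rep(1, n_c)) / n_c
pX_D0w <- as.vector(K_c %*% w[observed == 1]) / n_c

# Summed absolute differences, before and after weighting the controls.
L1_unweighted <- sum(abs(pX_D1 - pX_D0))
L1_weighted   <- sum(abs(pX_D1 - pX_D0w))
```

With control weights that are all 1, pX_D0w coincides with pX_D0, so the two sums are equal; weights from kernel balancing are intended to shrink the weighted version.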

Examples


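# attach the package (named kbal) so the data and functions are available
library(kbal)
# load the lalonde data and select the covariates
data(lalonde)
xvars <- c("age", "black", "educ", "hisp", "married", "re74", "re75", "nodegr", "u74", "u75")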

#need to first build gaussian kernel matrix
K_pass <- makeK(allx = lalonde[,xvars])
#also need the SVD of this matrix
svd_pass <- svd(K_pass)

#running without passing weights in directly, using numdims=33
l1_lalonde <- getdist(target = lalonde$nsw,
                      observed = 1-lalonde$nsw,
                      K = K_pass,
                      svd.U = svd_pass$u,
                      numdims = 33)

#alternatively, we can get the weights ourselves and pass them in directly
#using the first 33 dims of svd_pass$u to match the above
w_opt <- getw(target = lalonde$nsw,
              observed = 1-lalonde$nsw,
              svd.U = svd_pass$u[,1:33])$w
l1_lalonde2 <- getdist(target = lalonde$nsw,
                       observed = 1-lalonde$nsw,
                       K = K_pass,
                       w = w_opt)


chadhazlett/KBAL documentation built on Jan. 3, 2024, 9:57 p.m.