DAPSest is a function that can be used to perform Distance Adjusted Propensity Score Matching.
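The propensity scores must be estimated beforehand, for example with glm(Z ~ X, data = dataset, family = binomial), and the estimates included as prop.scores in the data frame of observations.
We are now ready to fit DAPSm to the data. The arguments of the function are described here in detail: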
dataset
: This is a data frame where each row is an observation, including all the information described above (treatment, outcome, covariates, coordinates, and propensity score estimates as "prop.scores").
out.col
: If the data frame includes the outcome column as 'Y', this does not need to be specified. Otherwise, set out.col to the index of the column that contains the outcome.
trt.col
: If the treatment is included in the data frame with the name 'X', this argument does not need to be specified. Otherwise, set trt.col to the index of the column that contains the binary treatment.
caliper
: The caliper that will be used for matching.
weight
: This corresponds to the $w$ argument in the definition of DAPS. Higher values of weight base the matching score more on the propensity score difference and less on proximity.
coords.columns
: If the coordinate columns are named 'Longitude' and 'Latitude', this does not need to be specified. Otherwise, set coords.columns to the indices of the coordinate columns.
pairsRet
: Logical. If set to TRUE, a matrix of information on the matched pairs will be returned. This includes the unit IDs, coordinates, treatment, outcome, and propensity score of the matched pairs. Each matched pair corresponds to a row, so we can identify which treated unit is matched to which control.
cov.cols
: Specify this if the weight is set to 'optimal'; otherwise it is ignored. This argument gives the indices of the columns that contain the observed covariates. With the optimal weight choice, the algorithm will ensure that the variables with indices in cov.cols are balanced.
cutoff
: The cutoff of ASDM below which a variable is considered balanced. Only required when the weight is set to 'optimal'.
w_tol
: Specify when weight = 'optimal'. This sets the tolerance used when choosing the optimal $w$. If w_tol is small, more iterations will be needed to choose the final value of $w$.
coord_dist
: Logical. If set to TRUE, geo-distance will be calculated from the longitude and latitude information given in the dataset. If set to FALSE, Euclidean distance will be calculated instead.
distance
: Function. Since the propensity score difference and distance are combined into one quantity (DAPS), they should be on similar scales. For example, distance values that are too large would dominate the definition of DAPS. For that reason, distance needs to be specified as a function that takes in a matrix of arbitrarily large distances and turns it into a matrix of elements scaled to the propensity score difference scale. The package includes two choices (but any other function would also work):
StandDist
: This function takes in a distance matrix with entries $d_{ij}$ and returns a matrix with corresponding entries
$$ D_{ij} = \frac{d_{ij} - \min_{ij} d_{ij}}{\max_{ij} d_{ij} - \min_{ij} d_{ij}} $$
EmpCDF
: This function takes in a distance matrix with entries $d_{ij}$ and returns a matrix with corresponding values equal to the empirical cdf of $d$.
caliper_type
: Should be 'DAPS' or 'PS'. This specifies which quantity the caliper is set on; for example, caliper_type = 'PS' places the caliper on the propensity score difference.
quiet
: When weight is set to 'optimal', the quiet argument controls whether the algorithm prints the current stage of the optimal $w$ search. Defaults to FALSE.
true_value
: If used, an indicator is returned that is TRUE if the confidence interval includes the true value and FALSE if it does not.
As mentioned before, the optimal weight algorithm described above is a fast search for the optimal $w$, defined as the minimum $w$ for which the absolute standardized difference of means (ASDM) of the observed covariates is less than a cutoff. This might be appropriate for very large data sets, for which an extensive search for the optimal value of $w$ is not feasible. The fast search is based on the assumption that ASDM decreases as more and more matching weight is given to the propensity score (increasing $w$).
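To make the dataset, weight and distance arguments described above more concrete, here is a minimal base R sketch. It builds a toy data frame with the default column names ('Y', 'X', 'Longitude', 'Latitude', prop.scores), scales the distance matrix in the two ways described for StandDist and EmpCDF, and combines it with the propensity score difference. The form w * |PS difference| + (1 - w) * scaled distance is an assumption consistent with the description of weight. The covariate names C1 and C2 and the helpers min_max_scale(), ecdf_scale() and daps_matrix() are hypothetical stand-ins, not the package's own code, and Euclidean distance on the raw coordinates is used only for simplicity.

```r
set.seed(42)
n <- 40

# Hypothetical toy data; treatment, outcome and coordinate names follow the defaults
dataset <- data.frame(
  Longitude = runif(n, -100, -90),
  Latitude  = runif(n, 30, 40),
  C1        = rnorm(n),  # illustrative covariate names (assumption)
  C2        = rnorm(n)
)
dataset$X <- rbinom(n, 1, plogis(0.5 * dataset$C1 - 0.3 * dataset$C2))  # binary treatment
dataset$Y <- 1 + 2 * dataset$X + dataset$C1 + rnorm(n)                   # outcome

# Propensity scores from a logistic regression, stored as prop.scores
ps_model <- glm(X ~ C1 + C2, data = dataset, family = binomial)
dataset$prop.scores <- unname(fitted(ps_model))

trt <- which(dataset$X == 1)
con <- which(dataset$X == 0)

# Absolute propensity score difference for every treated (row) / control (column) pair
ps_diff <- abs(outer(dataset$prop.scores[trt], dataset$prop.scores[con], "-"))

# Euclidean distance between treated and control units
xy <- as.matrix(dataset[, c("Longitude", "Latitude")])
dist_mat <- as.matrix(dist(xy))[trt, con, drop = FALSE]

# Stand-in for min-max scaling: (d - min d) / (max d - min d)
min_max_scale <- function(d) (d - min(d)) / (max(d) - min(d))

# Stand-in for empirical cdf scaling: each entry replaced by its empirical cdf value
ecdf_scale <- function(d) {
  matrix(ecdf(as.vector(d))(as.vector(d)), nrow = nrow(d), ncol = ncol(d))
}

# Assumed form of the score: w * |PS difference| + (1 - w) * scaled distance
daps_matrix <- function(w, ps_diff, scaled_dist) {
  w * ps_diff + (1 - w) * scaled_dist
}

scaled    <- min_max_scale(dist_mat)
daps_half <- daps_matrix(0.5, ps_diff, scaled)
```

With w = 1 the score reduces to the propensity score difference, and with w = 0 it reduces to the scaled distance, which is why the two components need to live on comparable scales.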
However, ASDM does not always decrease with $w$. For example, if one of the observed covariates in our data set is spatially structured, then distance alone might balance it adequately, and ASDM as a function of $w$ will not necessarily be decreasing.
For that reason, we suggest a second option for calculating the optimal $w$. This algorithm still defines the optimal $w$ as the minimum $w$ for which the ASDM of all covariates is below a cutoff. However, instead of the fast search, one can scan multiple values of $w$ ranging from 0 to 1 and choose the smallest one that achieves balance.
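The selection rule of this second option can be sketched in a few lines of base R. The ASDM values below are made-up numbers used purely for illustration, and choose_smallest_balanced_w() is a hypothetical helper rather than a package function. The second covariate is given an ASDM that increases with $w$, mimicking a spatially structured covariate for which the decreasing-ASDM assumption fails.

```r
# Hypothetical helper: smallest w whose ASDM is below the cutoff for every covariate
choose_smallest_balanced_w <- function(w_grid, asdm, cutoff) {
  balanced <- apply(abs(asdm) < cutoff, 1, all)  # one TRUE/FALSE per value of w
  if (!any(balanced)) return(NA_real_)           # no scanned w achieves balance
  min(w_grid[balanced])
}

w_grid <- seq(0, 1, by = 0.1)

# Made-up ASDM values: rows follow w_grid, columns are covariates
asdm <- cbind(
  C1 = c(0.40, 0.33, 0.27, 0.21, 0.16, 0.12, 0.09, 0.07, 0.05, 0.04, 0.03),
  C2 = c(0.03, 0.04, 0.05, 0.07, 0.09, 0.11, 0.13, 0.16, 0.18, 0.20, 0.22)
)

cutoff <- 0.15
w_opt <- choose_smallest_balanced_w(w_grid, asdm, cutoff)
w_opt
```

In this toy table both covariates are below the cutoff only at $w$ = 0.5 and $w$ = 0.6, so the rule returns 0.5. Returning NA when no scanned value achieves balance is a design choice of this sketch.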
This can be done with CalcDAPSWeightBalance().
Use PlotWeightBalance() to plot the standardized difference of means as a function of $w$.
Use DAPSchoiceModel() to choose the optimal $w$ and acquire the effect estimates and set of matched pairs.
DAPSWeightCE() plots the estimates for the varying values of $w$, along with 95% confidence intervals and a loess smoothing curve. This should NOT be used to choose the value of $w$ that will be reported.