dsmatchATT: Double Score Matching Estimator for Average Treatment Effect...
In Yunshu7/dsmatch: Use double score matching algorithm to estimate ATE and QTE


View source: R/dsmatchATT.R

dsmatchATT applies a matching algorithm to estimate the average treatment effect for the treated (ATT) based on the propensity score and the prognostic score. Classical matching algorithms, including propensity score matching, prognostic score matching and matching directly on covariates, are also included for comparison. Covariate balance results are also provided.

dsmatchATT(
  Y,
  X,
  A,
  method = "dsm",
  model.ps = "other",
  ps = NULL,
  lp.ps = NULL,
  model.pg = "other",
  pg = NULL,
  lp.pg = NULL,
  caliper = NULL,
  replace = T,
  cov.balance = F,
  varest = F,
  boots = 100,
  mc = F,
  ncpus = 4,
  ...
)

`Y`	Outcome as numeric vector.
`X`	Covariates as numeric vector or matrix.
`A`	Treatment assignment as numeric vector, where `1` stands for the treatment group and `0` stands for the control group.
`method`	Matching method to use: `"dsm"` for double score matching, `"ps"` for propensity score matching, `"pg"` for prognostic score matching and `"cov"` for matching on covariates directly.
`model.ps`	Model fitted for the propensity score: `"logit"` for a logistic model, `"probit"` for a probit model, `"linpred"` for a logistic model with linear predictors specified by `lp.ps`. Not needed if `ps` is given.
`ps`	Propensity score as numeric vector given by the user. Not needed if `model.ps` is given.
`lp.ps`	Linear predictors for the propensity score as numeric vector or matrix. Not needed unless `model.ps` is `"linpred"`.
`model.pg`	Model fitted for the prognostic score: `"glm"` for a linear model for continuous outcome, `"glm_logit"` for a logistic model for binary outcome, `"glm_probit"` for a probit model for binary outcome, `"linpred"` for a linear model for continuous outcome with linear predictors specified by `lp.pg`, `"zir_logit"` for a zero-inflated model using a logistic model to fit the non-zero probability, `"zir_probit"` for a zero-inflated model using a probit model to fit the non-zero probability. Not needed if `pg` is given.
`pg`	Prognostic score as numeric matrix given by the user. The first column is the potential outcome for the control group and the second column is the potential outcome for the treatment group. Not needed if `model.pg` is given.
`lp.pg`	Linear predictors for the prognostic score as numeric vector or matrix. Not needed unless `model.pg` is `"linpred"`.
`caliper`	A scalar or vector denoting the caliper(s) which should be used when matching. A caliper is the distance which is acceptable for any match. Observations which are outside of the caliper are dropped. If a scalar caliper is provided, this caliper is used for all covariates in X. If a vector of calipers is provided, a caliper value should be provided for each covariate in X. See function `Match` in `Matching` for more details.
`replace`	A logical flag for whether matching should be done with replacement. Note that if FALSE, the order of matches generally matters. See function `Match` in `Matching` for more details.
`cov.balance`	A logical scalar for whether covariate balance results should be shown.
`varest`	A logical scalar for whether variance of estimator should be estimated.
`boots`	A numeric scalar for the number of bootstrap replicates in variance estimation. Not needed if `varest` is `F`.
`mc`	A logical scalar for whether multiple cores are used in variance estimation. Not needed if `varest` is `F`.
`ncpus`	A numeric scalar for the number of cores used in variance estimation. Not needed if `varest` is `F`.
`...`	Additional parameters for `Match` in `Matching`.
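
To make the `caliper` and `replace` arguments more concrete, the following is a minimal base R sketch of nearest-neighbour matching with replacement on a single score. It is not the code used by dsmatchATT or by `Match`; the toy scores, the variable names and the choice to express the caliper in standard deviations of the pooled score are all assumptions made for illustration.

```r
# Toy matching scores (hypothetical values, one score per unit)
set.seed(1)
score_treated <- rnorm(5, mean = 0.5)
score_control <- rnorm(20, mean = 0)

# Caliper expressed in standard deviations of the pooled score (assumption)
caliper <- 0.25
tol <- caliper * sd(c(score_treated, score_control))

# Distance from every treated unit (rows) to every control unit (columns)
dist_mat <- abs(outer(score_treated, score_control, "-"))

# Matching with replacement: each treated unit takes its nearest control,
# so the same control can be used more than once
nearest <- apply(dist_mat, 1, which.min)
nearest_dist <- dist_mat[cbind(seq_along(score_treated), nearest)]

# Treated units whose nearest control lies outside the caliper are dropped
keep <- nearest_dist <= tol
matched <- data.frame(
  treated  = which(keep),
  control  = nearest[keep],
  distance = nearest_dist[keep]
)

# Share of treated units that found a match inside the caliper
matching_rate <- mean(keep)
```

A smaller caliper keeps only closer matches, which lowers `matching_rate`. Without replacement, a control could be used only once, so the result would depend on the order in which treated units are processed.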

For both the propensity score and the prognostic score, the user should either select a model or provide the score directly. If linear predictors are used to fit a logistic model for the propensity score or a linear model for the prognostic score, they should be supplied through the lp.ps or lp.pg argument. If model.ps (and lp.ps, if linear predictors are used) is given, then ps does not need to be specified, and vice versa. However, if the propensity score is given by ps while model.ps is also chosen, the model is ignored and matching is based on the user-supplied score; a warning is issued in this case. The same rules apply to the prognostic score.
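
As an illustration of providing the scores directly, the sketch below builds a `ps` vector and a two-column `pg` matrix with base R on simulated data. The simulated data, the model formulas and the choice to fit the outcome model separately within each treatment arm are assumptions, not the package's internal procedure. The final call is shown as a comment and is not run.

```r
# Simulated data (hypothetical): two covariates, binary treatment, continuous outcome
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
X  <- cbind(x1 = x1, x2 = x2)
A  <- rbinom(n, 1, plogis(0.5 * x1 - 0.3 * x2))
Y  <- 1 + x1 + 0.5 * x2 + 2 * A + rnorm(n)
dat <- data.frame(Y = Y, A = A, X)

# Propensity score: fitted P(A = 1 | X) from a logistic regression
ps_fit <- glm(A ~ x1 + x2, family = binomial(), data = dat)
ps <- as.vector(fitted(ps_fit))

# Prognostic score: column 1 predicts the outcome under control,
# column 2 predicts the outcome under treatment (assumed arm-wise fits)
fit0 <- lm(Y ~ x1 + x2, data = dat, subset = A == 0)
fit1 <- lm(Y ~ x1 + x2, data = dat, subset = A == 1)
pg <- cbind(predict(fit0, newdata = dat), predict(fit1, newdata = dat))

# Hypothetical call, not run here: with ps and pg supplied,
# model.ps and model.pg are left at their defaults
# dsmatchATT(Y, X, A, method = "dsm", ps = ps, pg = pg)
```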

A special model for the prognostic score is the zero-inflated regression model, which fits a logistic or probit model for the probability of a zero outcome and a regression model for the non-zero values.
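
The two-part idea can be sketched in base R as follows. This is an illustration only: the simulated data, the variable names and the way the two parts are combined (probability of a non-zero outcome times the fitted mean of the non-zero values) are assumptions and may differ from what the `"zir_logit"` and `"zir_probit"` options do internally. Modelling the non-zero probability is equivalent to modelling the zero probability, since the two sum to one.

```r
# Simulated outcome with many exact zeros (hypothetical data)
set.seed(1)
n <- 300
x <- rnorm(n)
nonzero <- rbinom(n, 1, plogis(0.5 + x))
y <- nonzero * exp(1 + 0.5 * x + rnorm(n, sd = 0.3))
dat <- data.frame(y = y, x = x, nonzero = as.integer(y > 0))

# Part 1: logistic model for the probability that the outcome is non-zero
fit_nz <- glm(nonzero ~ x, family = binomial(), data = dat)

# Part 2: linear regression fitted on the non-zero outcomes only
fit_pos <- lm(y ~ x, data = dat, subset = y > 0)

# Combine the two parts into one predicted outcome per unit (assumed form)
p_nonzero <- predict(fit_nz, newdata = dat, type = "response")
mean_pos  <- predict(fit_pos, newdata = dat)
pg_hat    <- p_nonzero * mean_pos
```

Replacing `binomial()` with `binomial(link = "probit")` gives the probit variant of the first part.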

Some parameters of Match in Matching are already exposed as parameters of this function, such as caliper and replace. Additional parameters for Match can be passed through ..., except ties and Weight, which are already set inside the function.

Results are put in a list:

`est.ds`	Point estimate of ATT if matching is based on double score.
`est.ps`	Point estimate of ATT if matching is based on propensity score.
`est.pg`	Point estimate of ATT if matching is based on prognostic score.
`est.x`	Point estimate of ATT if matching is based on covariates directly.
`boot.var`	Variance of estimator estimated by bootstrap. Meaningless if `varest` is `F`.
`bootq1`	0.025 quantile of estimator estimated by bootstrap. Meaningless if `varest` is `F`.
`bootq2`	0.975 quantile of estimator estimated by bootstrap. Meaningless if `varest` is `F`.
`cov.bal`	Standardized difference in means for all covariates.
`matching.detail`	Returned object from function `Match` in package `Matching`; not included for double score matching with a caliper.
`matching.rate`	Matching rate resulting from the caliper or replacement setting.
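
For readers unfamiliar with the balance measure reported in `cov.bal`, here is a minimal base R sketch of a standardized difference in means computed per covariate. The helper `std_diff`, the toy data and the pooled standard deviation in the denominator are assumptions; the exact formula used by the package is not stated on this page.

```r
# Toy data (hypothetical): treatment indicator and two covariates
set.seed(1)
n <- 100
A <- rbinom(n, 1, 0.4)
X <- cbind(age = rnorm(n, 30 + 2 * A, 5), educ = rnorm(n, 12, 2))

# Hypothetical helper: difference in group means divided by a pooled
# standard deviation (square root of the average of the two group variances)
std_diff <- function(x, a) {
  m1 <- mean(x[a == 1])
  m0 <- mean(x[a == 0])
  s  <- sqrt((var(x[a == 1]) + var(x[a == 0])) / 2)
  (m1 - m0) / s
}

# One standardized difference per covariate column
bal <- apply(X, 2, std_diff, a = A)
bal
```

Values close to zero indicate that the treated and control groups have similar means for that covariate.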

# import lalonde data from package "lalonde"
library(lalonde)
nsw <- lalonde::nsw
cps3 <- lalonde::cps_controls3

# combine datasets
nsw <- nsw[,-1]
cps3 <- cps3[,c(2,3,4,5,6,7,8,10,11)]
lalonde <- rbind(nsw, cps3)

# preprocessing of data
Y = lalonde[,"re78"]
Y = as.matrix(Y)
Y = as.vector(Y)
X = lalonde[,c("age","education","black","hispanic","married","nodegree","re75")]
X = as.matrix(X)
A = lalonde[,"treat"]
A = as.matrix(A)
A = as.vector(A)

# linear predictors used in the algorithm
# take logarithm for income and standardize covariates
Z = X
Z[,"re75"] = log(Z[,"re75"] + 1)
Z[,"age"] = (Z[,"age"] - mean(Z[,"age"])) / sd(Z[,"age"])
Z[,"education"] = (Z[,"education"] - mean(Z[,"education"])) / sd(Z[,"education"])
Z[,"re75"] = (Z[,"re75"] - mean(Z[,"re75"])) / sd(Z[,"re75"])
Z = cbind(X, Z[,"age"]^2, Z[,"education"]^2, Z[,"re75"]^2)

# estimate ATT using four matching methods
set.seed(1)
dsmatchATT(Y, X, A, method = "dsm", model.ps = "linpred", lp.ps = Z, model.pg = "linpred", lp.pg = Z, varest = T, cov.balance = T)
dsmatchATT(Y, X, A, method = "ps", model.ps = "linpred", lp.ps = Z, model.pg = "linpred", lp.pg = Z, varest = T, cov.balance = T)
dsmatchATT(Y, X, A, method = "pg", model.ps = "linpred", lp.ps = Z, model.pg = "linpred", lp.pg = Z, varest = T, cov.balance = T)
dsmatchATT(Y, X, A, method = "cov", model.ps = "linpred", lp.ps = Z, model.pg = "linpred", lp.pg = Z, varest = T, cov.balance = T)

# estimate QTT using double score matching
p = 0.3
set.seed(1)
res <- dsmatchQTT(Y, X, A, p, method = "dsm", model.ps = "linpred", lp.ps = Z, model.pg = "linpred", lp.pg = Z, varest = T)
res
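# Wald interval for QTT: qnorm(0.025) is negative, so the first line
# gives the lower bound and the second line the upper bound
res$est.ds + qnorm(0.025) * sqrt(res$bootvar)
res$est.ds - qnorm(0.025) * sqrt(res$bootvar)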