estimateATT: estimateATT function

estimateATT R Documentation

estimateATT function

Description

This function matches exposed units and unexposed units by a pre-specified buffer distance (Euclidean distance).
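
As a rough illustration of what matching within a Euclidean buffer means, the base R sketch below pairs each exposed unit with every unexposed unit no farther than distbuf away. It runs on hypothetical toy data, the column names (id, exposed, long, lat) are assumptions made for the illustration, and it is not the package's own implementation.

```r
# Hypothetical toy data; all column names are illustrative assumptions.
units <- data.frame(
  id      = 1:6,
  exposed = c(1, 1, 0, 0, 0, 0),
  long    = c(0.00, 1.00, 0.05, 0.20, 1.03, 1.08),
  lat     = c(0.00, 1.00, 0.05, 0.00, 1.04, 1.00)
)
distbuf <- 0.1  # buffer distance, in the same units as long/lat

exposed_units   <- units[units$exposed == 1, ]
unexposed_units <- units[units$exposed == 0, ]

# For each exposed unit, keep the unexposed units inside the buffer.
pairs <- do.call(rbind, lapply(seq_len(nrow(exposed_units)), function(i) {
  d <- sqrt((unexposed_units$long - exposed_units$long[i])^2 +
            (unexposed_units$lat  - exposed_units$lat[i])^2)
  keep <- d <= distbuf
  data.frame(
    exposed_id   = rep(exposed_units$id[i], sum(keep)),
    unexposed_id = unexposed_units$id[keep],
    distance     = d[keep]
  )
}))
print(pairs)
```

In this toy data, unit 1 is paired with unit 3 and unit 2 with units 5 and 6; unit 4 lies outside every buffer and is left unmatched.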

Usage

estimateATT(
  dataset,
  PSdataset,
  bexp,
  exp.status = 1,
  cexp,
  fmethod.replace = TRUE,
  distbuf = 0.1,
  exp.included = FALSE,
  long,
  lat,
  PS.method = "mgcv.GAM",
  PS.method.data = "original",
  PS.formula,
  PS.max_depth = 5,
  PS.eta = 0.1,
  PS.nthread = 1,
  PS.eval_metric = "logloss",
  PS.objective = "binary:logistic",
  PS.nrounds = 50,
  PS.cv.nround = 100,
  PS.cv.nfold = 5,
  PS.cv.min.burnin = 49,
  PS.early_stopping_rounds = 50,
  PS.cv.objective = "binary:logistic",
  PS.cv.max_depth = c(4, 5),
  PS.cv.eta = c(0.01, 0.05, 0.1),
  PS.cv.gamma = c(0),
  PS.cv.nthread = 1,
  PS.cv.subsample = 1,
  PS.cv.eval_metric = "logloss",
  PS.cv.colsample_bytree = 1,
  PS.cv.min_child_weight = 1,
  PS.cv.lambda = c(0),
  PS.cv.lambda_bias = c(0),
  PS.cv.alpha = c(0),
  PS.cv.scale_pos_weight = 1,
  PS.cv.local.N = 10,
  CGPS.method = "mgcv.GAM",
  CGPS.formula,
  CGPS.max_depth = 5,
  CGPS.eta = 0.1,
  CGPS.nthread = 1,
  CGPS.eval_metric = "rmse",
  CGPS.objective = "reg:squarederror",
  CGPS.nrounds = 50,
  CGPS.cv.nround = 100,
  CGPS.cv.nfold = 5,
  CGPS.cv.min.burnin = 49,
  CGPS.early_stopping_rounds = 50,
  CGPS.cv.objective = "reg:squarederror",
  CGPS.cv.max_depth = c(4, 5),
  CGPS.cv.eta = c(0.01, 0.05, 0.1),
  CGPS.cv.gamma = c(0),
  CGPS.cv.nthread = 1,
  CGPS.cv.subsample = 1,
  CGPS.cv.eval_metric = "rmse",
  CGPS.cv.colsample_bytree = 1,
  CGPS.cv.min_child_weight = 1,
  CGPS.cv.lambda = c(0),
  CGPS.cv.lambda_bias = c(0),
  CGPS.cv.alpha = c(0),
  CGPS.cv.scale_pos_weight = 1,
  CGPS.cv.local.N = 10,
  smethod = "nearest",
  caliper_bw = NULL,
  smethod.replace = FALSE,
  formulaDisease,
  family,
  bs.N,
  bs.replace = TRUE,
  corrmethod = "Pearson",
  varilist,
  modelinfo = FALSE
)

Arguments

dataset

a dataset object.

PSdataset

a dataset for PS estimation.

bexp

a character string indicating the name of the binary exposure. Enclose the name in quotation marks, like "VariableName".

exp.status

a numeric vector indicating the value that identifies exposed units. Default=1

cexp

a character string indicating the name of the continuous exposure. Enclose the name in quotation marks, like "VariableName".

fmethod.replace

an indicator of whether one-to-n distance-matching will be performed with replacement or without replacement. Default=TRUE. If FALSE, matching is done without replacement, the performance of which has not been tested. If FALSE, note that the output of this function may differ depending on the order of observation units in the original dataset.

distbuf

a numeric vector indicating the buffer distance by which exposed units and unexposed units are matched. Default=0.1.

exp.included

an indicator of whether exposed units are matched with not only unexposed units but also other exposed units. Default=FALSE. If FALSE, exposed units are matched with only unexposed units. See details

long

a character string indicating the name of the longitude variable of observation units

lat

a character string indicating the name of the latitude variable of observation units

PS.method

a character string or a vector of variable names, indicating the method of propensity score estimation. Options include "mgcv.GAM" (Generalized additive model in mgcv package), "xgboost" (extreme gradient boosting in xgboost package), and "xgboost.cv" (xgboost with cross-validation). Default="mgcv.GAM"

PS.formula

a character string indicating the formula of propensity score estimation. For PS.method="mgcv.GAM", this string must be like "the binary exposure variable ~ variableA+variableB+variableC+s(long,lat)". For PS.method="xgboost", this must be a vector like c("variableA", "variableB", "variableC", "long", "lat").

PS.max_depth

(xgboost only) a numeric vector indicating maximum depth of a tree. Default=5. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.eta

(xgboost only) a numeric vector indicating the learning rate. Default=0.1. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.nthread

(xgboost only) a numeric vector indicating the number of threads. Default=1. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.eval_metric

(xgboost only) a character string indicating evaluation metrics for validation data. Default="logloss". This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.objective

(xgboost only) a character string indicating the objective function. Default="binary:logistic". This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.nrounds

(xgboost only) a numeric vector indicating the number of rounds. Default=50. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.cv.nround

(xgboost.cv only) a numeric vector indicating the maximum number of rounds. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.cv.nfold

(xgboost.cv only) a numeric vector indicating N-fold cross-validation. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.cv.min.burnin

(xgboost.cv only) a numeric vector indicating the minimum number of rounds. Default=49, which indicates the minimum number of rounds is 50. This must be smaller than PS.early_stopping_rounds.

PS.early_stopping_rounds

(xgboost.cv only) a numeric vector indicating when xgboost stops. If the evaluation metric has not decreased for PS.early_stopping_rounds consecutive rounds, xgboost stops. This saves time. Default=50.

PS.cv.objective

(xgboost.cv only) a character string indicating the objective function. Default="binary:logistic". This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.cv.max_depth

(xgboost.cv only) a numeric vector indicating maximum depth of a tree. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.cv.eta

(xgboost.cv only) a numeric vector indicating the learning rate. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.cv.nthread

(xgboost.cv only) a numeric vector indicating the number of threads. Default=1. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.cv.subsample

(xgboost.cv only) a numeric vector indicating subsample ratio of the training instance. Default=1. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.cv.eval_metric

(xgboost.cv only) a character string indicating evaluation metrics for validation data. Default="logloss". This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.cv.colsample_bytree

(xgboost.cv only) a numeric vector indicating subsample ratio of columns when constructing each tree. Default=1. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.cv.min_child_weight

(xgboost.cv only) Default=1. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.cv.lambda

(xgboost.cv only) L2 Regularization term on weights. Default=0. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.cv.lambda_bias

(xgboost.cv only) L2 Regularization term on bias. Default=0. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.cv.alpha

(xgboost.cv only) L1 Regularization term on weights. Default=0. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.cv.scale_pos_weight

(xgboost.cv only) Default=1. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

PS.cv.local.N

(xgboost.cv only) a numeric vector indicating how many times xgboost will search for (local) minima of the evaluation metric function. Default=10.

CGPS.method

a character string or a vector of variable names, indicating the method of conditional generalized propensity score estimation. Options include "mgcv.GAM", "xgboost", and "xgboost.cv". Default="mgcv.GAM".

CGPS.formula

a character string indicating the formula of conditional generalized propensity score estimation.

CGPS.max_depth

(xgboost only) a numeric vector indicating maximum depth of a tree. Default=5. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.eta

(xgboost only) a numeric vector indicating the learning rate. Default=0.1. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.nthread

(xgboost only) a numeric vector indicating the number of threads. Default=1. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.eval_metric

(xgboost only) a character string indicating evaluation metrics for validation data. Default="rmse". This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.objective

(xgboost only) a character string indicating the objective function. Default="reg:squarederror". This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.nrounds

(xgboost only) a numeric vector indicating the number of rounds. Default=50. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.cv.nround

(xgboost.cv only) a numeric vector indicating the maximum number of rounds. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.cv.nfold

(xgboost.cv only) a numeric vector indicating N-fold cross-validation. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.cv.min.burnin

(xgboost.cv only) a numeric vector indicating the minimum number of rounds. Default=49, which indicates the minimum number of rounds is 50. This must be smaller than CGPS.early_stopping_rounds.

CGPS.early_stopping_rounds

(xgboost.cv only) a numeric vector indicating when xgboost stops. If the evaluation metric has not decreased for CGPS.early_stopping_rounds consecutive rounds, xgboost stops. This saves time. Default=50.

CGPS.cv.objective

(xgboost.cv only) a character string indicating the objective function. Default="reg:squarederror". This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.cv.max_depth

(xgboost.cv only) a numeric vector indicating maximum depth of a tree. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.cv.eta

(xgboost.cv only) a numeric vector indicating the learning rate. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.cv.nthread

(xgboost.cv only) a numeric vector indicating the number of threads. Default=1. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.cv.subsample

(xgboost.cv only) a numeric vector indicating subsample ratio of the training instance. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.cv.eval_metric

(xgboost.cv only) a character string indicating evaluation metrics for validation data. Default="rmse". This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.cv.colsample_bytree

(xgboost.cv only) a numeric vector indicating subsample ratio of columns when constructing each tree. Default=1. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.cv.min_child_weight

(xgboost.cv only) Default=1. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.cv.lambda

(xgboost.cv only) L2 Regularization term on weights. Default=0. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.cv.lambda_bias

(xgboost.cv only) L2 Regularization term on bias. Default=0. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.cv.alpha

(xgboost.cv only) L1 Regularization term on weights. Default=0. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.cv.scale_pos_weight

(xgboost.cv only) Default=1. This is a wrapper for xgboost::xgb.train. See https://xgboost.readthedocs.io/en/latest/parameter.html

CGPS.cv.local.N

(xgboost.cv only) a numeric vector indicating how many times xgboost will search for (local) minima of the evaluation metric function. Default=10.

smethod

a character string indicating the matching method used to conduct matching by GPS. Default="nearest". Options include "nearest" (nearest neighbor matching), "nearestcaliper" (nearest neighbor caliper matching), and "caliper" (caliper matching).

caliper_bw

a numeric vector indicating caliper bandwidth. Default=NULL. If smethod is "nearest", this parameter is ignored.

smethod.replace

an indicator of whether matching by GPS is done with replacement. Default=FALSE. If FALSE, matching is done without replacement, and the output of this function may differ depending on the order of observation units in the original dataset. Note: Matching with replacement has not been tested for its performance.

formulaDisease

a character string indicating the formula of the disease model.

family

a character string indicating the error distribution and link function to be used in the disease model.

bs.N

a numeric vector indicating the number of bootstrapping samples. If bs.N=1, then bootstrapping is not used and bs.replace is ignored.

bs.replace

an indicator of whether bootstrapping is done with replacement. Default=TRUE.

corrmethod

a character string indicating which correlation coefficient is to be computed. Options are "Pearson" (default), "Spearman", "Polychoric", and "Polyserial". For tetrachoric use "Polychoric" and for biserial use "Polyserial". This is a wrapper for wCorr::weightedCorr.

varilist

a character vector indicating variable names for which you wish to compute standardized mean difference. List variable names as a vector like c("VariableA","VariableB")
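
To make the standardized mean difference requested through varilist concrete, the base R sketch below computes one common version of it (difference in group means divided by a pooled standard deviation). The helper smd(), the data frame matched, and its column names are hypothetical, and the exact definition used inside estimateATT may differ.

```r
# One common definition of the standardized mean difference (pooled SD).
# This helper is hypothetical and not part of the package.
smd <- function(x, group) {
  x1 <- x[group == 1]
  x0 <- x[group == 0]
  (mean(x1) - mean(x0)) / sqrt((var(x1) + var(x0)) / 2)
}

# Hypothetical matched sample; column names are illustrative assumptions.
matched <- data.frame(
  exposed   = c(1, 1, 1, 0, 0, 0),
  VariableA = c(2, 4, 6, 1, 3, 5),
  VariableB = c(10, 10, 13, 9, 12, 12)
)

varilist <- c("VariableA", "VariableB")
balance  <- sapply(varilist, function(v) smd(matched[[v]], matched$exposed))
print(balance)
```

Here VariableA has a standardized mean difference of 0.5 and VariableB of 0, so only VariableB is balanced between the two groups.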

Examples

estimateATT()
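
The call above omits the arguments that have no default. As a minimal sketch of preparing one of them, the snippet below writes PS.formula in the two forms described under Arguments: a formula string for "mgcv.GAM" and a vector of variable names for "xgboost". The exposure name exposed and the covariate names are hypothetical, and nothing here calls estimateATT itself.

```r
# Form for PS.method = "mgcv.GAM": a single formula string.
ps_formula_gam <- "exposed ~ variableA + variableB + s(long, lat)"

# Form for PS.method = "xgboost": a character vector of predictor names.
ps_formula_xgb <- c("variableA", "variableB", "long", "lat")

# Base R can parse the string without fitting a model, which is a quick
# way to check that both forms refer to the same predictors.
f <- as.formula(ps_formula_gam)
predictors <- all.vars(f)[-1]  # drop the exposure on the left-hand side
print(predictors)
print(setequal(predictors, ps_formula_xgb))
```

Checking that the two forms name the same variables helps when comparing propensity scores from the GAM and xgboost methods.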

HonghyokKim/CGPSspatialmatch documentation built on April 24, 2022, 9:10 p.m.