overlap_fun: This function creates an overlapping dataset


Description

This function restricts the data to units that overlap according to their estimated generalized propensity score (gps) values. The resulting overlapping dataset depends on n_class, the number of classes used for subclassification.
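The idea of a gps-based overlap check can be sketched in base R. The code below is a minimal illustration, not the causaldrf source: it assumes a normal linear model for the treatment and a common-support rule in the spirit of Bia et al. (2014), and the simulated data and every name in it (dat, class_id, keep, overlap_sketch) are hypothetical.

```r
# Hypothetical sketch of a gps-based overlap check; NOT the causaldrf source.
set.seed(1)
n <- 500
x <- rnorm(n)
treat <- 1 + 0.8 * x + rnorm(n)            # simulated continuous treatment
dat <- data.frame(x = x, treat = treat)

# 1. Fit a normal linear model for the treatment (assumed model).
fit <- lm(treat ~ x, data = dat)
mu_hat <- fitted(fit)
sigma_hat <- summary(fit)$sigma

# 2. Split the units into n_class classes by treatment quantiles.
n_class <- 3
breaks <- quantile(dat$treat, probs = seq(0, 1, length.out = n_class + 1))
class_id <- cut(dat$treat, breaks = breaks, include.lowest = TRUE,
                labels = FALSE)

# 3. For each class, evaluate every unit's gps at the class median treatment
#    and keep the units whose gps lies in the range shared by class members
#    and non-members (assumed common-support rule).
median_vec <- tapply(dat$treat, class_id, median)
keep <- rep(TRUE, n)
for (k in seq_len(n_class)) {
  gps_k <- dnorm(median_vec[k], mean = mu_hat, sd = sigma_hat)
  in_k <- class_id == k
  lower <- max(min(gps_k[in_k]), min(gps_k[!in_k]))
  upper <- min(max(gps_k[in_k]), max(gps_k[!in_k]))
  keep <- keep & gps_k >= lower & gps_k <= upper
}

overlap_sketch <- dat[keep, ]
nrow(overlap_sketch)                       # units retained after trimming
```

Under this rule, a larger n_class means more support checks, so the retained set can only stay the same or shrink as classes are added to the loop.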

Usage

overlap_fun(Y,
            treat,
            treat_formula,
            data_set,
            n_class,
            treat_mod,
            link_function,
            ...)

Arguments

Y

is the name of the outcome variable contained in data_set.

treat

is the name of the treatment variable contained in data_set.

treat_formula

an object of class "formula" (or one that can be coerced to that class) that regresses treat on a linear combination of X: a symbolic description of the model to be fitted.

data_set

is a dataframe containing Y, treat, and X.

n_class

is the number of classes to split gps into.

treat_mod

a description of the error distribution to be used in the model for treatment. Options include: "Normal" for normal model, "LogNormal" for lognormal model, "Sqrt" for square-root transformation to a normal treatment, "Poisson" for Poisson model, "NegBinom" for negative binomial model, "Gamma" for gamma model.

link_function

is either "log", "inverse", or "identity" for the "Gamma" treat_mod.

...

additional arguments to be passed to the treatment regression function.
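To make the treat_mod and link_function arguments more concrete, here is a rough base R illustration of a gamma treatment model with a log link. It is an assumption that treat_mod = "Gamma" corresponds to a fit of this general kind; the page does not say which regression function is called internally, and the simulated data and object names are hypothetical.

```r
# Hypothetical illustration with base R only; NOT the causaldrf internals.
set.seed(2)
n <- 300
x <- rnorm(n)
# Positive, right-skewed treatment whose mean is exp(0.5 + 0.3 * x).
treat <- rgamma(n, shape = 2, rate = 2 / exp(0.5 + 0.3 * x))
dat <- data.frame(x = x, treat = treat)

# Gamma regression for the treatment. The link is chosen through the family
# object; "log", "inverse", and "identity" are the links Gamma() accepts.
gamma_fit <- glm(treat ~ x, family = Gamma(link = "log"), data = dat)
coef(gamma_fit)
```

The log link keeps the fitted mean positive for any covariate values, which is one reason it is a common choice for positive treatments; the inverse and identity links may need starting values to converge.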

Value

overlap_fun returns a list containing the following elements:

overlap_dataset

dataframe containing overlapping data.

median_vec

a vector containing median values.

overlap_treat_result

the resulting treatment fit.

References

Schafer, J.L., Galagate, D.L. (2015). Causal inference with a continuous treatment and outcome: alternative estimators for parametric dose-response models. Manuscript in preparation.

Bia, M., et al. (2014). A Stata package for the application of semiparametric estimators of dose response functions. Stata Journal 14(3), 580-604.

See Also

iptw_est, ismw_est, reg_est, aipwee_est, wtrg_est, etc. for other estimates.

t_mod, overlap_fun to prepare the data for use in the different estimates.

Examples

## Example from Schafer (2015).

example_data <- sim_data

overlap_list <- overlap_fun(Y = Y,
                  treat = T,
                  treat_formula = T ~ B.1 + B.2 + B.3 + B.4 + B.5 + B.6 + B.7 + B.8,
                  data_set = example_data,
                  n_class = 3,
                  treat_mod = "Normal")

overlapped_data <- overlap_list$overlap_dataset
summary(overlapped_data)

rm(example_data, overlap_list, overlapped_data)

Example output

      A.1                A.2                A.3                A.4          
 Min.   :-2.16090   Min.   :-3.62468   Min.   :-2.37188   Min.   :-3.49492  
 1st Qu.:-0.63657   1st Qu.:-0.65190   1st Qu.:-0.65857   1st Qu.:-0.73409  
 Median :-0.04458   Median : 0.01898   Median :-0.02908   Median : 0.03106  
 Mean   :-0.04038   Mean   :-0.01385   Mean   :-0.02268   Mean   :-0.01967  
 3rd Qu.: 0.53748   3rd Qu.: 0.66500   3rd Qu.: 0.59032   3rd Qu.: 0.61554  
 Max.   : 2.37493   Max.   : 2.83913   Max.   : 2.79272   Max.   : 3.65007  
      A.5                A.6                 A.7                 A.8           
 Min.   :-3.25750   Min.   :-3.128066   Min.   :-3.041520   Min.   :-3.218881  
 1st Qu.:-0.68608   1st Qu.:-0.636384   1st Qu.:-0.688681   1st Qu.:-0.652851  
 Median :-0.03000   Median : 0.068285   Median :-0.013420   Median :-0.008034  
 Mean   :-0.03602   Mean   : 0.006962   Mean   : 0.005488   Mean   :-0.019213  
 3rd Qu.: 0.63969   3rd Qu.: 0.634998   3rd Qu.: 0.629184   3rd Qu.: 0.631477  
 Max.   : 2.85290   Max.   : 2.812570   Max.   : 3.051767   Max.   : 3.081647  
      B.1                B.2             B.3             B.4        
 Min.   :-2.77129   Min.   :1.568   Min.   :18.09   Min.   :0.0178  
 1st Qu.:-0.46805   1st Qu.:2.066   1st Qu.:28.60   1st Qu.:0.3031  
 Median : 0.12624   Median :2.255   Median :33.33   Median :0.4296  
 Mean   : 0.07015   Mean   :2.273   Mean   :33.68   Mean   :0.4260  
 3rd Qu.: 0.67434   3rd Qu.:2.452   3rd Qu.:38.04   3rd Qu.:0.5527  
 Max.   : 1.86019   Max.   :3.703   Max.   :58.17   Max.   :0.8705  
      B.5                 B.6               B.7              B.8        
 Min.   :0.0003746   Min.   : 0.1082   Min.   :0.0000   Min.   :0.0000  
 1st Qu.:0.2490920   1st Qu.: 3.2006   1st Qu.:0.0000   1st Qu.:0.0000  
 Median :0.4845486   Median : 4.7494   Median :0.0000   Median :0.0000  
 Mean   :0.4862348   Mean   : 5.2450   Mean   :0.2344   Mean   :0.2309  
 3rd Qu.:0.7206392   3rd Qu.: 6.6311   3rd Qu.:0.0000   3rd Qu.:0.0000  
 Max.   :0.9998761   Max.   :19.2801   Max.   :1.0000   Max.   :1.0000  
       T                Y            Theta.1         Theta.2        
 Min.   : 8.441   Min.   :17.08   Min.   :32.26   Min.   :-1.70667  
 1st Qu.:11.027   1st Qu.:41.08   1st Qu.:45.56   1st Qu.:-0.43309  
 Median :11.898   Median :49.46   Median :49.57   Median :-0.03767  
 Mean   :11.927   Mean   :49.88   Mean   :49.68   Mean   :-0.01435  
 3rd Qu.:12.752   3rd Qu.:58.05   3rd Qu.:53.77   3rd Qu.: 0.39385  
 Max.   :15.685   Max.   :86.54   Max.   :66.87   Max.   : 1.99159  
 support_indices
 Min.   :1.000  
 1st Qu.:1.000  
 Median :2.000  
 Mean   :1.949  
 3rd Qu.:3.000  
 Max.   :3.000  

causaldrf documentation built on May 2, 2019, 5:14 a.m.