sim_latent_strat: Simulate Data


View source: R/ls_functions.R

Description

Simulates the data from the latent stratification model with four strata.

Usage

sim_latent_strat(
  n = 1e+05,
  p = 0.5,
  piA = 0.05,
  piB = 0.1,
  muA1 = 5,
  muA0 = 4,
  muB1 = 5,
  sigma = 1
)

Arguments

n

total sample size, the default value is 100000.

p

proportion of observations assigned to treatment, the default value is 0.5.

piA

proportion of stratum A, the default value is 0.05.

piB

proportion of stratum B, the default value is 0.10.

muA1

mean for stratum A when treated, Z = 1, the default value is 5.

muA0

mean for stratum A when not treated, Z = 0, the default value is 4.

muB1

mean for stratum B when treated, Z = 1, the default value is 5.

sigma

variance for all strata, the default value is 1.

Details

The four strata are defined as:
A = positive under treatment and control, or always buyer.
B = positive under treatment only, or influenced buyer.
C = never positive, or never buyer.
The fourth stratum, those who are positive under control only (also known as defiers), is assumed not to exist. The model also assumes that all strata share the same variance.
The outcome y for strata A and B is generated from mixtures of normal distributions. For stratum A, y is drawn from a normal distribution with mean muA1 under treatment and with mean muA0 under control. For stratum B, y is drawn from a normal distribution with mean muB1 under treatment and is 0 under control, because those in stratum B are not positive without treatment. For stratum C, y is always 0.
For the data frame in the output, column z is the dummy variable for treatment. If z = 1, then the observation has received treatment. If z = 0, then the observation has not received treatment.
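
The generating process described above can be sketched in a few lines of R. This is an illustrative re-implementation, not the package source: the function name sketch_latent_strat is hypothetical, strata are assumed to be drawn at random with probabilities piA, piB and 1 - piA - piB, treated rows are assumed to come first, and sigma is taken to be a variance (as the Arguments section states), so sqrt(sigma) is passed to rnorm(). The real sim_latent_strat may differ on any of these points, and this sketch returns only the data frame columns y, z and strata.

```r
# Hypothetical sketch for illustration; not the package implementation.
sketch_latent_strat <- function(n = 1e5, p = 0.5, piA = 0.05, piB = 0.10,
                                muA1 = 5, muA0 = 4, muB1 = 5, sigma = 1) {
  # Assumption: sigma is the common variance, so the sd is sqrt(sigma).
  sd_y <- sqrt(sigma)

  # Treatment dummy: the first round(n * p) rows are treated (z = 1).
  n_treat <- round(n * p)
  z <- c(rep(1, n_treat), rep(0, n - n_treat))

  # Assumption: each row's stratum is sampled independently.
  strata <- sample(c("A", "B", "C"), size = n, replace = TRUE,
                   prob = c(piA, piB, 1 - piA - piB))

  # Outcome starts at 0: stratum C and untreated stratum B stay at 0.
  y <- numeric(n)
  a1 <- strata == "A" & z == 1
  a0 <- strata == "A" & z == 0
  b1 <- strata == "B" & z == 1
  y[a1] <- rnorm(sum(a1), mean = muA1, sd = sd_y)
  y[a0] <- rnorm(sum(a0), mean = muA0, sd = sd_y)
  y[b1] <- rnorm(sum(b1), mean = muB1, sd = sd_y)

  data.frame(y = y, z = z, strata = strata)
}

set.seed(1)
d <- sketch_latent_strat(n = 10000, piA = 0.2, piB = 0.1,
                         muA1 = 5, muA0 = 4.5, muB1 = 3, sigma = 0.3)

# Difference in mean outcome between treated and untreated rows.
diff_means <- mean(d$y[d$z == 1]) - mean(d$y[d$z == 0])
```

Because treatment is assigned independently of the stratum in this sketch, diff_means is a simple estimate of the average treatment effect. Its value varies with the random seed.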

Value

A list containing a data frame, a numeric value, and two vectors.
The data frame data contains the outcome variable y, the treatment dummy z, the strata, and the mean-centered effects-coded dummies for the strata.
The numeric value ATE is the true average treatment effect.
The vector par contains the true parameters from which the data are simulated.
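
As a worked example of the true average treatment effect, the snippet below computes it from the default parameter values in Usage. It assumes that ATE is the proportion-weighted sum of the per-stratum effects implied by the Details section (muA1 - muA0 for stratum A, muB1 - 0 for stratum B, and 0 for stratum C). Whether the function computes ATE in exactly this way is not confirmed by this page.

```r
# Default parameter values from the Usage section.
piA <- 0.05
piB <- 0.10
muA1 <- 5
muA0 <- 4
muB1 <- 5

# Assumed per-stratum treatment effects:
#   A: muA1 - muA0,  B: muB1 - 0,  C: 0
ate <- piA * (muA1 - muA0) + piB * (muB1 - 0)
print(ate)
```

With these defaults, the stratum A term contributes 0.05 and the stratum B term contributes 0.5.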

Examples

sim = sim_latent_strat(n=10000, piA=0.2, piB=0.1, muA1=5, muA0=4.5, muB1=3, sigma=0.3)
sim$par
# a vector: piA 0.2, piB 0.1, muA1 5, muA0 4.5, muB1 3, sigma 0.3
sim$data
# a data frame containing outcome variable y, treatment dummy z, and strata. The first 5000 rows have z=1.
sim$ATE
# 0.4

zthuang0422/PURM-2021-Latent-Stratification documentation built on Dec. 23, 2021, 10:12 p.m.